RESPONSE-CONSEQUENCE CONTINGENCY DISCRIMINABILITY WHEN POSITIVE AND NEGATIVE REINFORCEMENT COMPETE IN CONCURRENT SCHEDULES

Except where reference is made to the work of others, the work described in this dissertation is my own or was done in collaboration with my advisory committee. This dissertation does not include proprietary or classified information.

Michael Austin Magoon

Certificate of Approval:
    M. Christopher Newland (Chair), Alumni Professor, Psychology
    Thomas S. Critchfield, Associate Professor, Psychology
    Martha C. Escobar, Assistant Professor, Psychology
    James F. McCoy, Associate Professor, Psychology
    Stephen L. McFarland, Dean, Graduate School

RESPONSE-CONSEQUENCE CONTINGENCY DISCRIMINABILITY WHEN POSITIVE AND NEGATIVE REINFORCEMENT COMPETE IN CONCURRENT SCHEDULES

Michael Austin Magoon

A Dissertation Submitted to the Graduate Faculty of Auburn University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

Auburn, Alabama
December 16, 2005

Permission is granted to Auburn University to make copies of this dissertation at its discretion, upon request of individuals or institutions, and at their expense. The author reserves all publication rights.

Signature of Author
Date of Graduation

DISSERTATION ABSTRACT

RESPONSE-CONSEQUENCE CONTINGENCY DISCRIMINABILITY WHEN POSITIVE AND NEGATIVE REINFORCEMENT COMPETE IN CONCURRENT SCHEDULES

Michael Austin Magoon

Doctor of Philosophy, December 16, 2005
(M.S., Auburn University, 2001)
(B.A., The Florida State University, 1996)
(A.A., Indian River Community College, 1994)

208 Typed Pages

Directed by M. Christopher Newland

Four experiments were conducted to test the qualitative prediction of contingency discriminability theory that any difference in type between the consequences of concurrently available discriminated operants will heighten response-consequence contingency discriminability and increase behavioral sensitivity to those consequences. In each experiment, three human subjects working under two-ply concurrent schedules of variable-cycle money reinforcement completed four experimental conditions in each of two phases. The phases differed from each other only in terms of the type of reinforcement contingency and/or the way in which the consequent stimuli were presented. Primary analyses were carried out via the standard conventions of the generalized matching relation. In Experiment 1, one phase consisted of concurrent schedules of positive versus positive reinforcement and the other consisted of concurrent schedules of positive versus negative reinforcement (avoidance). All subjects demonstrated steeper matching function slopes in the phase that arranged concurrently available different types of consequences. Experiments 2 and 3 were designed to test the necessity and/or sufficiency of two features that distinguish positive from negative reinforcement in Experiment 1: money gain versus money loss; and the presentation of feedback after subjects met versus failed to meet the reinforcement contingency. Both features were sufficient, but neither was necessary, to produce a slope effect similar to that seen in Experiment 1.
Experiment 4 differed from Experiments 1-3 by arranging concurrently available same-type consequences within each phase, but having the types differ across phases. No systematic slope effects were observed in Experiment 4. Overall, the results supported the qualitative prediction of contingency discriminability theory that any difference in type between the consequences of concurrently available discriminated operants will heighten response-consequence contingency discriminability and increase behavioral sensitivity to those consequences. These results are discussed in terms of the potential utility of the present methods to advance further research into the effects of choice-controlling variables other than reinforcement frequency, and in terms of some issues that must be resolved prior to doing so.

ACKNOWLEDGEMENTS

The author would like to extend warmest thanks to Tom Critchfield for offering the opportunity for this research to take place at Illinois State University. Without his conceptual, editorial, financial, and personal support, this dissertation never would have been completed. Greatest thanks are also offered to Chris Newland for seeing it through to the end, though he was under no obligation to do so, and for providing some of his ICRE funds for the research. Thanks are also due to Martha Escobar and Jim McCoy for agreeing to serve as committee members and for expanding the intellectual discourse. Pete Johnson, the outside reader, also deserves special thanks for working hard to expedite the dissertation completion process at the end. In addition to those directly involved with development of the document, the author would like to thank Barry Burkhart, Chair of the Department of Psychology at Auburn University, for helping to secure a departmental graduate research grant, and David Barone, Chair of the Department of Psychology at Illinois State University, for offering the author a teaching position during a portion of his time there. Many thanks also to Mei-Shio Jang for writing the computer program and for helping to extract the relevant data, to Dustin Merrill for his assistance with some portions of data collection, and to Chris Krageloh for his advanced empirical and conceptual help. Finally, thanks to Thane Bryant and Alice Carroll for their usually unheralded help with administrative processes and paperwork. Their help was particularly needed given that this research was carried out from a distance, and therefore is especially appreciated.

Style manual or journal used: American Psychological Association. (2001). Publication manual of the American Psychological Association (5th ed.). Washington, DC: Author.

Computer software used:
    QuickBasic 4.5
    Microsoft Word 2002 SP3
    Microsoft Excel 2002 SP3
    SPSS (Release 11.5.0)

TABLE OF CONTENTS

LIST OF TABLES AND FIGURES

CHAPTER 1: LITERATURE REVIEW
    Introduction
    Concurrent Schedules and the Matching Relation
        The strict matching relation
        The generalized matching relation
    Contingency Discriminability Theory
        Classical signal detection theory
        Initial integration
        Testing the models and their assumptions
    The Differential Outcomes Effect
        The utility of the GMR in testing the DOE
    The Matching Relation and Negative Reinforcement
        Research with nonhumans
        Research with humans
        Summary
    Rationale for Current Studies
        Magoon and Critchfield (2005) replication
        Investigation of sensitivity differences
    References
    Footnote
    Figure Caption
    Figure

CHAPTER 2: RESPONSE-CONSEQUENCE CONTINGENCY DISCRIMINABILITY WHEN POSITIVE AND NEGATIVE REINFORCEMENT COMPETE IN CONCURRENT SCHEDULES
    Introduction
    GENERAL METHOD
        Subjects
            Participants
            Money Earnings and Payment
        Setting and Apparatus
        Procedure
            Instructions
            Experimental task
            Schedules
            Consequences
            Stability criteria
        Data Analysis
    Experiment 1
        Method
        Results
            Slopes (Sensitivity)
                Visual analysis
                Descriptive analysis
                Normative analysis
                Inferential statistical analysis
            Intercepts (Bias)
                Visual analysis
                Descriptive analysis
                Normative analysis
                Inferential statistical analysis
            Experiment 1 Results Summary
        Discussion
    Experiment 2
        Method
        Results
            Slopes (Sensitivity)
                Visual analysis
                Descriptive analysis
                Normative analysis
                Inferential statistical analysis
            Intercepts (Bias)
                Visual analysis
                Descriptive analysis
                Inferential statistical analysis
            Experiment 2 Results Summary
        Discussion
    Experiment 3
        Method
        Results
            Slopes (Sensitivity)
                Visual analysis
                Descriptive analysis
                Normative analysis
                Inferential statistical analysis
            Intercepts (Bias)
                Visual analysis
                Descriptive analysis
                Inferential statistical analysis
            Experiment 3 Results Summary
        Discussion
    Experiment 4
        Method
        Results
            Slopes (Sensitivity)
                Visual analysis
                Descriptive analysis
                Normative analysis
                Inferential statistical analysis
            Intercepts (Bias)
                Visual analysis
                Descriptive analysis
                Inferential statistical analysis
            Experiment 4 Results Summary
        Discussion
    GENERAL DISCUSSION
        Results Summary
        Implications for Quantitative Model Development of CDT
        Evaluating GMR Parameter Differences
        A Methodological Concern of the Present Investigation
        Conclusion
    References
    Footnotes
    Tables
    Figure Captions
    Figures
    Appendix A
    Appendix B

CHAPTER 3: EXTENDED GENERAL DISCUSSION
    Introduction
    Studies of Reinforcer Magnitude as a Framework for Studies of Reinforcer Type
    Comments on the DLOE
        On reinforcer type versus magnitude
        Reinforcer type differences and bias
        On the provisional status of the DLOE
        Concatenating the DLOE
    References
    Figure Caption
    Figure

TABLES AND FIGURES

Chapter 1: Literature Review
    Figure 1. SDT Matrix

Chapter 2: Response-Consequence Contingency Discriminability when Positive and Negative Reinforcement Compete in Concurrent Schedules
    Table 1. Experiment 1 Conditions
    Table 2. Experiment 2 Conditions
    Table 3. Experiment 3 Conditions
    Table 4. Experiment 4 Conditions
    Table 5. Experiment 1 Independent Regressions
    Table 6. Experiment 1 MRC Regressions
    Table 7. Experiment 2 Independent Regressions
    Table 8. Experiment 2 MRC Regressions
    Table 9. Experiment 3 Independent Regressions
    Table 10. Experiment 3 MRC Regressions
    Table 11. Experiment 4 Independent Regressions
    Table 12. Experiment 4 MRC Regressions
    Table 13. Overall Results Table
    Figure 1. Matching functions for Experiment 1
    Figure 2. Matching functions for Experiment 2
    Figure 3. Matching functions for Experiment 3
    Figure 4. Matching functions for Experiment 4
    Figure 5. Slopes and Slope Differences

Chapter 3: Extended General Discussion
    Figure 1. Intercept and Intercept Differences

CHAPTER 1

Literature Review

Most operant research can be characterized as belonging to one of two broad, and typically mutually exclusive, categories: stimulus control and consequence control. Studies of stimulus control tend to hold constant maximally different consequence conditions (e.g., continuous reinforcement versus extinction) and vary antecedent stimuli. Studies of consequence control tend to hold constant antecedent stimuli and vary one or more different consequence dimensions (e.g., frequency, magnitude, type, delay, etc.).
The formulation of a unified theory capable of accounting for results from both domains would be a landmark achievement. Unfortunately, conceptual and quantitative theory development of stimulus and consequence control has not proceeded in parallel; the study of consequence control has arguably been more successful (Davison, 1991). Nevertheless, integration of an apparently successful theory of consequence control with the apparently equally successful signal detection theory (SDT) has led to the emergence of contingency discriminability theory (CDT) as a potential unifying theory. The two purposes of this literature review are to describe the course of development of CDT and to propose research that might aid in its further conceptual and quantitative development. The former will be accomplished by first examining the evolution of the matching relation and then by describing its integration with SDT. The latter will be accomplished by reviewing research that used the matching relation to study negative reinforcement,¹ but that may have unintentionally harnessed an effect predicted by CDT, and then by critiquing that research and offering suggestions as to how to improve it.

Concurrent Schedules and the Matching Relation

Ferster and Skinner (1957) exposed pigeons to two response keys on the same wall of an operant chamber, programmed schedules of reinforcement on each key such that responses on one key did not influence the schedule on the alternate key (i.e., the schedules were programmed independently), and described response patterns on each of the keys. Findley (1958) was the first to characterize behavior under this arrangement beyond mere description. He suggested that preference and switching, rather than the schedule-induced behavior on each key, were the two fundamental characteristics of such behavior. He defined switching as an "alternation in prepotency" of the two operants and preference as when an organism is "emitting one operant to the partial exclusion of the other" (p. 123). To the extent that an organism is said to choose the option it most prefers, concurrent schedule arrangements provide an experimental vehicle by which to quantify the relative reward value of different conditions of reinforcement (deVilliers, 1977).

The strict matching relation. Findley (1958) set out to replicate the findings of Ferster and Skinner (1957). His initial cumulative records made it apparent that pigeons' switches between schedules were only observable when a 5-s delay was imposed between a switch and the delivery of a reinforcer on the other schedule. As a result, he developed a new procedure (a) to make the switching behavior more explicit, and (b) to specify clearly the stimulus occasion for each response to make it more subject to the control of the experimenter. His new method, the switching-key, changeover-key, or Findley procedure, also used two response keys. However, in contrast to the traditional procedure, one key was specified as the response key, on which responses were subject to the contingencies specified by the experiment, and the other key was designated the switching or changeover (CO) key. The function of the CO key was to alternate the schedules operating on the response key. Typically, the CO key is associated with a light different in color from the ones associated with each of the schedules programmed on the response key.
A response on the CO key, then, actually has two functions: (a) to change the schedule operating on the response key; and (b) to change the color of the light associated with the schedule on the response key. Although the two-key procedure and the CO-key procedure seem to be very dissimilar methodologically, empirical evidence suggests that performance on the two procedures is functionally identical (Davison & McCarthy, 1988).

Findley (1958) arranged his concurrent schedules so that one schedule remained constant across conditions and the other varied the rate of reinforcers delivered. He found that under these conditions pigeons' response rates on one schedule were increased by increased reinforcement rates on that schedule and were decreased by increased reinforcement rates on the other schedule. This relation was confounded, however, because the absolute reinforcement rate changed from condition to condition. Herrnstein (1961) sought to rectify this problem by holding the overall reinforcement rate constant while varying the distribution of reinforcement delivery across the two keys via differing variable-interval (VI) schedules. He exposed pigeons to a two-key procedure and manipulated the use of a changeover delay (COD). Only with a COD in effect did he find an orderly relationship between the relative rates of reinforcement delivered on the two keys and the relative rates of responding on the two keys. The relationship conformed to the quantitative expression

\frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2},    (1)

where B_x is the number of responses to option x and R_x is the number of reinforcers delivered on option x. This equation described the general relationship, but was later to be modified according to further empirical findings.

Herrnstein (1961) unsuccessfully attempted to fit data from several previous experiments to his new equation. He found that data from single-key experiments, data from multiple schedule experiments, and data from experiments where there was no COD deviated from strict matching (Herrnstein, 1961). Subsequent experiments validated the necessity of the COD (Catania, 1963a; Catania & Cutts, 1963); increased the generality of the relationship across subjects (Catania & Cutts, 1963; Shull & Pliskoff, 1967), procedures (Shull & Pliskoff, 1967), and reinforcers (Shull & Pliskoff, 1967); extended the relationship to matching with the relative size of reinforcers (Catania, 1963b; Neuringer, 1967) and with the relative delays to reinforcers (Chung & Herrnstein, 1967); and extended the equation to include other constants (Catania, 1963a). Including constants in the equation was actually suggested by Herrnstein (1961) when he noted that the function was probably not linear but instead was probably concave downward (i.e., negatively accelerated). In spite of these empirical validations (Catania, 1966), his equation was still not able to account for data obtained with single-key and multiple schedule arrangements.
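To make the prediction of Equation 1 concrete before turning to its revision, consider a hypothetical pair of VI schedules (the numbers are illustrative only, not data from the studies cited above). If the two alternatives deliver R_1 = 40 and R_2 = 20 reinforcers per hour, strict matching predicts

\frac{B_1}{B_1 + B_2} = \frac{40}{40 + 20} = \frac{2}{3},

that is, two-thirds of all responses should be allocated to the richer alternative, whatever the absolute response rate happens to be.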
Herrnstein (1970) revised Equation 1 to include two free parameters that would better accommodate data that did not fit the original equation, particularly data from single-key and multiple schedule arrangements. The new equation,

B_1 = \frac{k R_1}{R_1 + m R_2 + R_o},    (2)

was based on new assumptions. First, his original equation assumed that the only responses available to the subject were the ones explicitly programmed by the experimenter. He thought this unlikely and probably the basis for the equation's inability to account for responding in single-key procedures. "A more defensible assumption is that at every moment of possible action, a set of alternatives confronts the animal, so that each action may be said to be the outcome of a choice" (Herrnstein, 1970, p. 254). In essence, he was saying that just because the organism is not responding on the experimental manipulanda does not mean that it is doing nothing. In all likelihood, it is doing something else, albeit unmeasured, that is being reinforced by something that is also unmeasured. The addition of the constant k is meant to quantify all measured and unmeasured responses in units commensurate with the measured response (i.e., k = B_1 + B_o), and the addition of the R_o parameter is meant to quantify all unmeasured reinforcers in units commensurate with the experimenter-defined reinforcers (Herrnstein, 1970).

A second novel assumption had to do with the interactions of schedules over temporal distance (Herrnstein, 1970), that is, with multiple schedules. In concurrent schedules, all sources of reinforcement can plausibly be assumed to exert their effects on all operating schedules; that is, a full interaction of schedules can be expected (Herrnstein, 1970). However, multiple schedules do not allow for such an assumption. In the case of multiple schedules, the organism is exposed to only one schedule at a time and has no control over changing them. Therefore, it can be assumed that the further removed from one another the schedules are, the less each will interact with the other. This additional assumption was reflected in his quantitative formulation by adding another "scaling" constant, m, which could range from 0 to 1, representing the degree of interaction between schedules. If there were maximum interaction between schedules (i.e., concurrent schedules), the constant would be 1 and would reduce the equation to the concurrent formulation. Any deviation from maximum interaction would reduce the scaling constant accordingly.

Rachlin (1971) criticized Herrnstein's approach as tautological: changing quantitative formulations to accommodate new assumptions has merit, but it does not describe the most appropriate way for an empirical law to evolve. He suggested that new assumptions needed to be borne out through empirical validation. Davison and McCarthy (1988) reviewed how the data collected to the point of their writing failed to reconcile Herrnstein's (1970) quantitative formulation with his theoretical interpretation, primarily due to the demonstrated lack of independence between response parameters and reinforcement parameters (Davison & McCarthy, 1988, pp. 35-38).

The generalized matching relation. Baum (1974) noted that in several cases where strict matching (i.e., Herrnstein's 1961 formulation) could not be found, reexamination of the data via an algebraic equivalent (i.e., as a ratio) brought order to seemingly disorderly data. His generalized matching relation (GMR) thus took the form

\frac{B_1}{B_2} = c \left( \frac{R_1}{R_2} \right)^a,    (3)

and describes a power function similar to what Herrnstein (1961) originally proposed. The log-linear transform of Equation 3 is

\log \left( \frac{B_1}{B_2} \right) = a \log \left( \frac{R_1}{R_2} \right) + \log c,    (4)

and yields linear functions to which Baum (1974) knew of "no study of simple concurrent schedules in which the data fail to conform" (p. 231). This equation uses two parameters that can be derived from the data and used to describe various deviations from Herrnstein's equation, yet it retains the formal properties of Herrnstein's (1961) original, or strict, matching relation because Equations 3 and 4 reduce to the ratio form of Equation 1 when the added parameters each equal one. The additional parameters are described by Baum (1974) as accounting for what he called bias (log c) and sensitivity (a) of behavior to various experimental (or nonexperimental) conditions.

"Bias means unaccounted for preference" (Baum, 1974, p. 233). It indicates a stable preference between either the responses or the reinforcers within a given experimental arrangement. Sometimes these preferences are unaccounted for, as Baum described them, and other times they may be manipulated to see how they vary with other independent variables. Other forms of the GMR have specified the various parameters that may be responsible for biased matching (e.g., the concatenated matching relation; see Davison & McCarthy, 1988, for a discussion). It was Baum's contention that when matching is not found, it is not because the equation is inadequate, or that something is wrong with the organism being studied, but that the experimenter has failed to control or measure all of the relevant independent variables. If, for example, the methods of an experiment are such that everything is held constant and the only difference that remains is a particular contingency difference (e.g., positive vs. negative reinforcement), and a bias is revealed, the results could be interpreted as a preference for that programmed difference.
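In practice, the two parameters of Equation 4 are estimated by least-squares regression of log response ratios on log reinforcer ratios across conditions. The following minimal sketch illustrates the fit; the session totals are entirely hypothetical, invented for illustration rather than taken from any study discussed here:

import numpy as np

# Hypothetical totals from five reinforcer-ratio conditions:
# columns are (B1, B2) responses and (R1, R2) obtained reinforcers.
B = np.array([[210, 790], [340, 660], [505, 495], [660, 340], [800, 200]], dtype=float)
R = np.array([[10, 40], [15, 30], [20, 20], [30, 15], [40, 10]], dtype=float)

x = np.log10(R[:, 0] / R[:, 1])   # log reinforcer ratios
y = np.log10(B[:, 0] / B[:, 1])   # log response ratios
a, log_c = np.polyfit(x, y, 1)    # slope = sensitivity a, intercept = bias log c
print(f"sensitivity a = {a:.2f}, bias log c = {log_c:.2f}")

A slope near 1.0 and an intercept near 0 would indicate strict matching; this is the same log-linear convention used for the primary analyses reported in Chapter 2.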
Sensitivity is less well understood. Baum (1974) described two types of sensitivity deviations, overmatching and undermatching. Overmatching is the situation where an organism responds more than predicted by strict matching to the schedule providing the more frequent reinforcement (i.e., the richer schedule) and less than predicted to the schedule providing the less frequent reinforcement (i.e., the leaner schedule). Quantitatively, this results in a value of a greater than one. Undermatching is the reverse situation, where an organism responds less than predicted to the richer schedule and more than predicted to the leaner schedule. This results in a value of a less than one. The reasons for sensitivity deviations from matching have not yet been fully explicated (Baum, 1974, 1979; Davison & McCarthy, 1988; Davison & Nevin, 1999; deVilliers, 1977; Madden & Perone, 1999). Baum (1974) offered three preliminary possibilities. First, he thought undermatching likely if there were poor discrimination between alternatives. It might be assumed that, conversely, he thought overmatching would be the product of heightened discrimination between alternatives, although he did not say as much. Second, he thought that too short a COD would result in undermatching and that increasing CODs would shift the matching relation toward overmatching. This possibility is related to the first because presumably a longer COD helps to differentiate the two response options from each other (Findley, 1958; Herrnstein, 1961). Third, he thought that increased deprivation would result in undermatching and decreased deprivation would result in shifts toward overmatching because, when satiated, there are decreases in overall responding and in rate of changeover.
He buttressed this point by referring to his own previous research, in which satiated pigeons conformed to the matching relation in the absence of a COD.

In a subsequent paper devoted exclusively to further understanding sensitivity in the matching relation, Baum (1979) evaluated the distribution of sensitivity values (i.e., a values) across all the data he had on hand and offered a new possible reason for deviations from unity. He determined matching equation parameters using both least-squares and nonparametric regression procedures for individual subjects from 23 different studies varying across labs, species (rats, pigeons, and humans), responses (lever press, key peck, button press), dependent measures (responses and time), and procedures (changeover-key, two-key, and three-key under forced-changeover and independent schedules). His analyses revealed that time measures more closely approximated strict matching (a values distributed around 1.0) than response measures (a values distributed around 0.8). To him this meant that local response rates on the leaner alternative must be different from the rates at the richer alternative. He ascribed these local rate differences to asymmetrical pausing. If pausing is favored at the richer alternative, overmatching is observed. If pausing is favored at the leaner alternative, undermatching is observed. Since his own observations revealed the latter to be the norm, he concluded that, in addition to his previous reasons, a further possible reason for observed undermatching is more frequent pausing on the leaner alternative.

Baum (1974, 1979) believed unit sensitivity and bias (i.e., strict matching) to be the norm in any standard, rigorously controlled concurrent schedules arrangement. He thought bias deviations were a function of either (a) a response bias, (b) a discrepancy between scheduled and obtained reinforcement, (c) qualitatively different reinforcers, or (d) qualitatively different schedules (Baum, 1974). He thought sensitivity deviations were a function of either (a) variables responsible for differentiating operants (e.g., exteroceptive stimuli, COD), (b) changing or differing deprivation levels (Baum, 1974, 1979), or (c) asymmetrical pausing (Baum, 1979). Baum's conclusions regarding bias have been generally accepted with some added parameters (see Davison & McCarthy, 1988). His conclusions regarding sensitivity have not proven to be fully adequate and have continued to be vetted through various means and methods. SDT and its integration with the GMR offers a promising means by which to better understand the variables influencing sensitivity.

Contingency Discriminability Theory

As progress was being made in the development of the matching relation, classical SDT (Green & Swets, 1966) was advancing a relatively successful quantitative model that ostensibly separated stimulus effects from other effects (e.g., consequences). Nevin (1969) was the first to identify the possible amalgamation of these two apparently disparate research domains by performing matching analyses of signal detection data. His work began an effort to develop an integrated quantitative model of behavioral detection that could account for both stimulus control and reinforcement effects, and that would eventually lead to the more general and conceptual CDT.

Classical signal detection theory. Signal detection theory has its origins in classical psychophysics, which is concerned primarily with quantitative analyses of sensory thresholds.
In a most basic psychophysical absolute threshold experiment, a 11 stimulus is presented repeatedly at both increasing and decreasing intensities. On each trial, the subject reports (verbally with humans, some other operant response with nonhumans) whether they sensed the stimulus. By presenting the stimulus at both increasing and decreasing intensities, the experimenter can determine an average, or absolute threshold, for that stimulus for that subject. Signal detection procedures use similar methods except that the signal is either presented with noise (i.e., some other distracting stimuli) (S 1 ) or the noise is presented without the signal (S 2 ) in discrete trials. This allows four possible outcomes on each trial and is illustrated in Figure 1. When the signal-with-noise (S 1 ) is presented, the subject can report its presence (B 1 ) or its absence (B 2 ) corresponding to a hit (R 11 ) or a miss (R 12 ), respectively. When the noise-without-signal (S 2 ) is presented, the subject can also report its presence (B 1 ) or its absence (B 2 ) corresponding to a false alarm (R 21 ) or a correct rejection (R 22 ), respectively. Since misses and correct rejections are the inverse counts of hits and false alarms, respectively, simply plotting the probability of hits as a function of the probability of false alarms reveals orderly relationships. The curves generated this way are called receiver-operating characteristic curves. Probabilities are used to generate the curves because the theory assumes that probability distributions (usually normal but different versions use different types), one representing the internally observed effects of each stimulus, interact with each other and with other ?subject? variables to determine the response on a particular trial. d' is the measure of the distance between these hypothesized probability distributions (in standard deviation units under the assumption of normal distributions) and represents sensory sensitivity between the two stimuli. ? is the measure of the other ?subject? variables 12 mentioned above. The complete rationale for its value is beyond the scope of this paper (see Green & Swets, 1966, ch. 4), but it is based on assumptions revolving around the ?rationality of maximization.? Suffice it to say that it represents the bias a subject brings to the situation. According to the theory, the bias can be the product of pre-experimental biases, expectations based on instructions, the consequences of responding, and/or any other nonstimulus variables (as defined by the experimenter). The logic behind the determination of ? is, in fact, a part of the basis for Nevin?s (1969) call for integration with the matching relation. More will be said on this below. In any case, the point at which the bias value intersects with the overlapping probability distributions is the decision point. It is further assumed that, for any given observation of a stimulus, the subject transforms the probability distributions into a ?likelihood ratio.? If the likelihood ratio exceeds ?, one response will occur and if it is less than ?, the other will occur. Initial integration. ?The most striking achievement of the theory is that if signal strength is constant, d? remains the same within variations of any given experiment?More importantly, it remains constant across different classes of experiment? (Nevin, 1969, p. 477). 
This feature, combined with SDT?s failure to deal effectively with response bias and the matching relation?s success in quantifying the effects of differential reinforcement, is probably what made integration of the two so attractive. Signal detection theory isolated stimulus control effects while being unable to explain reinforcer effects, and the matching relation isolated reinforcer effects while being unable to account for stimulus control effects. Several researchers thought that the matching relation might provide a reasonable bias function for SDT. 13 Aside from the conceptual similarity of the two research domains, Nevin (1969) also proposed that multiple concurrent schedules were methodologically parallel to those used in classical signal detection research. Typical SDT research follows each correct response (i.e., hits and correct rejections) with some sort of reinforcement (e.g., food, feedback, points) and each incorrect response (i.e., misses and false alarms) with punishment or extinction. A multiple concurrent FR1 Ext. concurrent Ext. FR1 schedule mimics this procedure and, if arranged in discrete trials, exactly replicates it. When one component of the multiple schedule, signaled by a distinct stimulus (S 1 ), is presented, one response (B 1 ) is reinforced (R 11 ) and the other (B 2 ) is not (R 12 ). When the other component, signaled by a different distinct stimulus (S 2 ), is presented, the previously reinforced response (B 1 ) is no longer reinforced (R 21 ) and the other one (B 2 ) is (R 22 ). Nevin (1981) reported a behavioral detection model, initially presented at the 1977 meeting of the Psychonomic Society, which used Herrnstein?s (1961) strict matching relation for its bias function. He started by noting that application of the strict matching relation to the signal detection paradigm leads to a problem. If reinforcement is delivered only for B 1 and never for B 2 in the presence of S 1 (i.e., R 11 = reinforcement, R 12 = extinction), and only for B 2 and never for B 1 in the presence of S 2 (i.e., R 22 = reinforcement, R 21 = extinction), then only B 1 should occur in the presence of S 1 and only B 2 should occur in the presence of S 2 . That is, there should be perfect responding, no errors. In contrast to that expectation, errors are found to occur consistently under such circumstances (Nevin, 1981). Nevin (1981) proposed that the effect of a reinforcer for a response in the presence of one stimulus generalizes to strengthen the same response in the presence of the other stimulus and that the subject distributes its responses to match the ratio of the combination of direct and generalized reinforcement. He incorporated a new parameter, ?, designating the similarity of S 1 and S 2 , to account for the generalization of reinforcement effects. The resulting equations were, in the presence of S 1 , 22 11 2 1 R R B B ? = , (5) and in the presence of S 2 , 22 11 2 1 R R B B ? = . (6) He suggested that 1/? was an index of discriminability closely related to Green and Swet?s (1966) d?. Furthermore, from his new equations he could easily derive an isosensitivity curve equation and an isobias curve equation that were identical to a previously developed linear learning model based on ?rather abstract considerations of choice theory? (Nevin, 1981, p. 12). He described his model as leading ?to predictions that cannot readily be distinguished from those of the elaborate conceptual machinery of classical signal detection theory? (Nevin, 1981, p. 14). 
At the same time as Nevin was presenting his model in 1977, Davison and Tustin (1978) independently developed their own version of the behavioral detection model. The primary difference between the models was that Davison and Tustin (1978) used Baum's (1974) GMR as the basis for their bias function instead of the strict matching relation. Furthermore, they proposed a different conceptual mechanism through which differential stimuli influenced response distribution. In their model, differential stimuli did not influence responding simply through stimulus generalization as proposed by Nevin (1981); rather, the model assigned a biasing role to stimuli much like that of reinforcers. In the presence of S₁, responding became more biased toward B₁, and in the presence of S₂, responding became more biased toward B₂. Their new equations were, in the presence of S₁,

\frac{B_1}{B_2} = c d \left( \frac{R_{11}}{R_{22}} \right)^a,    (7)

and in the presence of S₂,

\frac{B_1}{B_2} = \frac{c}{d} \left( \frac{R_{11}}{R_{22}} \right)^a.    (8)

The log-linear transforms of Equations 7 and 8 are, in the presence of S₁,

\log \left( \frac{B_1}{B_2} \right) = a_{R1} \log \left( \frac{R_{11}}{R_{22}} \right) + \log c + \log d,    (9)

and in the presence of S₂,

\log \left( \frac{B_1}{B_2} \right) = a_{R2} \log \left( \frac{R_{11}}{R_{22}} \right) + \log c - \log d.    (10)

a_Rx represents sensitivity to changes in reinforcement rate in the presence of stimulus x. At this point in the development of their model, they assumed sensitivity to reinforcement to be the same in the presence of S₁ as in the presence of S₂ (i.e., a_R1 = a_R2). This assumption would later prove to be the source of an important change in their model. Log d is the bias caused by the signaling stimulus. Equations 9 and 10 both have the same response in the response-ratio numerator (B₁); thus log d is added in Equation 9, reflecting a bias toward that response in the presence of S₁, and is subtracted in Equation 10, reflecting a bias away from that response in the presence of S₂. It is easier to see in the log-linear form that they were treating stimulus effects similarly to reinforcement effects, in that each biased responding. It was the first suggestion that no distinction between the discriminative and reinforcing functions of stimuli is logically necessary.

Two points are clear from Equations 9 and 10. First, when there is no discrimination between stimuli (i.e., d = 1, so that log d = 0), Equations 7 and 8 and Equations 9 and 10 reduce to Equations 3 and 4 (the GMR), respectively, thus preserving the fundamental matching relationship. Second, stimulus biases (log d) are independent of both inherent bias (log c) and reinforcement effects [a log(R₁₁/R₂₂)], thus preserving the fundamental SDT independence of discriminability and bias. Furthermore, the isosensitivity curve and isobias curve equations derived from Davison and Tustin's (1978) equations (p. 333, Equation 10, and p. 334, Equation 13) are mathematically identical to those found by Nevin (1981) (and by extension to a previous linear learning model) and Green and Swets (1966). That the same equations were independently derived from several different perspectives is, indeed, of interest.
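One practical consequence of Equations 9 and 10 is that log d and log c can be estimated by simple arithmetic on the response ratios obtained in the two stimulus components: subtracting Equation 10 from Equation 9 isolates 2 log d, and adding them isolates 2 log c once the reinforcement term drops out. A minimal sketch follows (counts hypothetical; it assumes a_R1 = a_R2 and equal obtained reinforcer rates, R₁₁ = R₂₂, so that the reinforcement term vanishes):

import math

def detection_estimates(b1_s1, b2_s1, b1_s2, b2_s2):
    # b1_s1/b2_s1: B1 and B2 counts in the presence of S1;
    # b1_s2/b2_s2: B1 and B2 counts in the presence of S2.
    s1 = math.log10(b1_s1 / b2_s1)   # Equation 9: log c + log d (reinforcement term = 0)
    s2 = math.log10(b1_s2 / b2_s2)   # Equation 10: log c - log d
    log_d = (s1 - s2) / 2            # stimulus-produced bias (discriminability)
    log_c = (s1 + s2) / 2            # inherent bias
    return log_d, log_c

print(detection_estimates(90, 10, 20, 80))   # roughly log d = 0.78, log c = 0.18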
Testing the models and their assumptions. Subsequent research extended the Davison-Tustin model to account for procedures that arranged reinforcement for "errors" (Davison & McCarthy, 1980), following the suggestion of the Nevin (1981) model. Thus, the Davison-Tustin model seemed to integrate successfully the considerations of all research perspectives (but see Nevin, Jenkins, Whittaker, & Yarensky, 1982, for a critique). That is, until Miller, Saunders, and Bourland (1980) provided evidence that the Davison-Tustin assumption of reinforcement sensitivity invariance across stimulus disparity differences (McCarthy & Davison, 1979, 1980) was invalid.

The research done by Davison and McCarthy up to Miller et al.'s (1980) paper used discrete-trials methods (e.g., Davison & McCarthy, 1980; McCarthy & Davison, 1979, 1980). Outside the context of Davison, McCarthy, and Nevin's work, Miller et al. (1980) investigated the role of varying stimulus disparity in a free-operant changeover-key procedure (Findley, 1958) using Baum's (1974) original GMR. They exposed three groups of pigeons to different line orientations on the response key. Group 1 was the 0-degree disparity group; that is, the line orientations indicating which component of the concurrent schedule was operating were not different from each other within subject but were different across subjects within the group. Group 2 was the 15-degree disparity group and Group 3 was the 45-degree disparity group. All subjects were exposed to comparable relative reinforcement rate conditions. Their results clearly showed that Baum's (1974) measure of sensitivity (a) increased with increasing stimulus disparity and with increased absolute rates of reinforcement (see also Nevin et al., 1982, for more on absolute rates of reinforcement).

The fundamental problem was that the Davison-Tustin model did not address the factors that determine sensitivity to reinforcement. They simply continued to include it, as Baum (1974) had originally proposed, as a free parameter to be derived from the data. In light of the evidence from Miller et al. (1980), this position was no longer tenable. Subsequent research tended to refute the assumptions that sensitivity to changes in relative reinforcement was independent of stimulus disparity (i.e., a_R1 = a_R2) and of absolute rates of reinforcement (Davison, McCarthy, & Jensen, 1985; Logue, 1983; McCarthy, Davison, & Jenkins, 1982; White, Pipe, & McLean, 1984). Consequently, Davison and Jenkins (1985) proposed a replacement for Baum's (1974) GMR using a measure of response-consequence contingency discriminability in place of Baum's free parameter a. The proposed equation was

\frac{B_1}{B_2} = c \left( \frac{d_{br} R_1 + R_2}{d_{br} R_2 + R_1} \right),    (11)

where d_br measures the discriminability of the operant classes signaled by the reinforcing stimuli. Thus, this version of the matching relation has no free sensitivity parameter; it has only a measure of inherent bias (log c) and a measure of reinforcement bias that is a function of frequency of reinforcement moderated by the "confusability" of the response-consequence contingency. They defined the confusability as being partly due to the subject's sensory ability and partly due to the stimuli signaling the reinforcement contingencies.

The process described by Equation 11 is similar to the generalized strengthening effect across stimuli proposed by Nevin (1981). d_br varies between 1 (no response-consequence contingency discriminability) and infinity (perfect response-consequence contingency discriminability). When d_br = 1, the subject cannot discriminate which reinforcers resulted from which response.
Consequently, reinforcers for B₂ (i.e., R₂) affect B₁ to the same degree as reinforcers for B₁ (i.e., R₁), and vice versa; the effective reinforcement ratio reduces to 1, and the slope of the log-linear regression line (i.e., the matching function) is zero, indicating no differential responding despite changes in relative rates of reinforcement. When d_br = infinity, the effects of R₂ on B₁ and the effects of R₁ on B₂ are minimized, effectively to zero, resulting in a relative reinforcement ratio determined solely by R₁ and R₂. The resulting equation is Baum's (1974) GMR with no sensitivity parameter (i.e., a = 1). This model does not allow for overmatching.

Using their version of the matching relation in place of Baum's (1974) GMR, Davison and Jenkins (1985) proposed a new version of the Davison and Tustin (1978) model of behavioral detection. The resulting equations were, in the presence of S₁,

\frac{B_1}{B_2} = c d_{sb} \left( \frac{d_{br} R_{11} + R_{22}}{d_{br} R_{22} + R_{11}} \right),    (12)

and in the presence of S₂,

\frac{B_1}{B_2} = \frac{c}{d_{sb}} \left( \frac{d_{br} R_{11} + R_{22}}{d_{br} R_{22} + R_{11}} \right),    (13)

where d_sb serves the same function as d does in the Davison-Tustin model, but is subscripted to differentiate stimulus-response contingency discriminability from response-consequence contingency discriminability. These equations maintain the independence of inherent bias, stimulus bias, and reinforcement bias and maintain the fundamental matching relationship, but are more specific regarding the measure of behavioral sensitivity to changes in relative rates of reinforcement. Furthermore, by treating stimulus effects and reinforcement effects similarly mathematically, this formulation extends Davison and Tustin's (1978) earlier contention that discriminative stimuli and reinforcers act on behavior in similar ways.

Although the Davison-Jenkins model appeared to be conceptually defensible, Alsop (1991) and Davison (1991) revealed a quantitative logical flaw. Equations 12 and 13 assume independent effects of d_sb and d_br, which they contend "is not a reasonable prediction" (Alsop, 1991, p. 43). The Davison-Jenkins formulation mathematically asserts that if there is no response-consequence contingency discriminability (i.e., d_br = 1), stimulus-response contingency discriminability will still exert its own independent effect on behavior. That is, the model predicts differential responding to two distinct stimuli even if the subject cannot discriminate differential reinforcement between the two responses. It is not reasonable to predict, for example, that a red light signaling one reinforcement contingency and a green light signaling another reinforcement contingency could differentially affect behavior if the reinforcement contingencies themselves were indiscriminable from each other. To remedy the quantitative flaw of Davison and Jenkins's (1985) model, Alsop (1991) proposed the following modifications: in the presence of S₁,

\frac{B_1}{B_2} = c \left( \frac{d_{sb} d_{br} R_{11} + R_{22}}{d_{sb} R_{11} + d_{br} R_{22}} \right),    (14)

and in the presence of S₂,

\frac{B_1}{B_2} = c \left( \frac{d_{br} R_{11} + d_{sb} R_{22}}{R_{11} + d_{sb} d_{br} R_{22}} \right).    (15)

Like the Davison-Jenkins detection model, when d_sb = 1, Equations 14 and 15 reduce to Davison and Jenkins's (1985) matching relation (Equation 11). However, unlike the Davison-Jenkins model, when d_br = 1, Equations 14 and 15 predict equal responding to B₁ and B₂.
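The difference between the two formulations is easy to verify numerically. The sketch below (values hypothetical) evaluates the S₁ equation of each model with indiscriminable contingencies (d_br = 1) but highly disparate stimuli (d_sb = 10): the Davison-Jenkins form (Equation 12) still predicts strong differential responding, whereas the Alsop modification (Equation 14) predicts indifference, which is exactly the conceptual point at issue:

def davison_jenkins_s1(c, d_sb, d_br, r11, r22):
    # Equation 12: stimulus bias multiplies the reinforcement term independently of d_br
    return c * d_sb * (d_br * r11 + r22) / (d_br * r22 + r11)

def alsop_s1(c, d_sb, d_br, r11, r22):
    # Equation 14: stimulus and reinforcer discriminabilities interact
    return c * (d_sb * d_br * r11 + r22) / (d_sb * r11 + d_br * r22)

print(davison_jenkins_s1(1.0, 10.0, 1.0, 30, 30))  # 10.0: differential responding persists
print(alsop_s1(1.0, 10.0, 1.0, 30, 30))            # 1.0: indifference, as Alsop argued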
Alsop (1991) provided the mathematical justification for the new version of the behavioral detection model, but Davison (1991) provided the initial conceptual justification. To better illustrate its implications, it is helpful to perform some simple mathematical operations and rewrite the equations as, in the presence of S₁,

\frac{B_1}{B_2} = c \left( \frac{R_{11} + R_{22}/(d_{br} d_{sb})}{R_{11}/d_{br} + R_{22}/d_{sb}} \right),    (16)

and in the presence of S₂,

\frac{B_1}{B_2} = c \left( \frac{R_{11}/d_{sb} + R_{22}/d_{br}}{R_{11}/(d_{sb} d_{br}) + R_{22}} \right).    (17)

Davison (1991) argued that every "stimulus/response/reinforcer combination... is discriminable from other such combinations as a function of the joint stimulus[-response contingency] and [response-consequence] contingency discriminabilities of the two combinations" (pp. 65-66). Referring to Equation 16, it can be seen that, in the limits of perfect stimulus-response and response-consequence contingency discriminability (i.e., d_sb = d_br = infinity), the only effective reinforcement to influence B₁ is R₁₁; the other terms in the reinforcement-bias portion of the equation functionally reduce to zero. Consequently, in the presence of S₁, only B₁ responses would be expected to occur (assuming no inherent bias, log c). To extend the example to Equation 17, perfect stimulus-response and response-consequence contingency discriminability would result in a reinforcement bias of zero. Thus, in the presence of S₂, there is no effective reinforcement for B₁ and therefore no such responses would be expected to occur. In the other extreme, the limits of zero stimulus-response and response-consequence contingency discriminability (i.e., d_sb = d_br = 1) lead to reinforcement-bias values of 1 in both Equations 16 and 17. Accordingly, response-ratio values in the presence of each of the stimuli would only be a function of inherent bias (log c).

It is also important to understand how stimulus-response and response-consequence contingency discriminabilities function independently of one another. If there were no stimulus-response discriminability and perfect response-consequence contingency discriminability (i.e., d_sb = 1 and d_br = infinity), Equations 16 and 17 would both reduce to Herrnstein's (1961) strict matching equation with Baum's (1974) bias parameter. In effect, response distributions would be expected to be a strict function of relative reinforcement distribution, independent of any stimulus effects. In the case of perfect stimulus-response discriminability and zero response-consequence contingency discriminability (i.e., d_sb = infinity and d_br = 1), reinforcement-bias effects would reduce to 1 in the presence of both stimuli, and response distributions would only be a function of inherent bias. That inherent bias is the only factor responsible for unequal response distributions whenever d_br = 1 demonstrates the presumed primacy of consequences by the Law of Effect (Thorndike, 1911). Stimulus differences can only exert an effect (i.e., stimulus control) if there is some degree of response-consequence contingency discriminability.

Davison and Nevin (1999) reviewed and developed the model under a signal-detection (i.e., conditional discrimination) paradigm and then extended it to see how well it described existing data in the areas of complex conditional discriminations, reinforcement for "errors" in conditional discriminations, multiple schedules, and simple concurrent schedules. They began by clarifying some of the model's most fundamental assumptions.
First, the concurrent discriminated operant is the fundamental analytic unit. Second, discriminative stimuli and reinforcers function similarly (i.e., they both affect behavior to the extent that their respective contingencies with their respective responses are discriminable). Third, two processes occur in determining steady-state behavior (in its current form, the model does not account for acquisition, transition, or extinction). The first process is one by which the discriminability parameters interact with environmental conditions to determine the effective allocation of reinforcers. The second process is one by which behavior allocation strictly matches the effective reinforcer distributions. They went on to describe fully the model as it applies to a 2x2 conditional discrimination analogous to the standard signal detection arrangement (i.e., a multiple (concurrent VI Ext) (concurrent Ext VI) schedule), emphasizing the interrelationships of d_sb and d_br (Davison & Nevin, 1999, pp. 447-452). Of particular interest here was how the model formally integrated the concept of stimulus generalization as responsible for the effective allocation of reinforcement effects, following Nevin (1981). They demonstrated the flexibility of the model in how it could be adapted to account for performance in 2x2 conditional discrimination procedures with reinforcers for "errors" and in complex conditional discriminations consisting of multiple stimuli and multiple responses.

Davison's approach to model building is to let model changes be driven by data and then to test each new version against existing data (Davison, 1991). Any model meant to replace an existing model must not neglect the data that led to the development of previous models. Consequently, Davison and Nevin (1999) illustrated the goodness of fit of their model to 2x2 conditional discriminations, value transfer, reinforcement for errors, matching to sample and its variants (identity vs. symbolic matching, delayed matching and delayed reinforcement, and second-order discrimination of mixed delays), complex stimulus discrimination, multiple stimuli and multiple correct responses, multiple schedules, and concurrent schedules. In each case, they concluded that the model described the data well.

Of particular interest to the present discussion is the model's prediction regarding conventional two-response concurrent schedules. Davison and Nevin (1999) made the case that in such an arrangement there is no conditional discrimination, just simple discrimination between alternatives. Accordingly, there is no S1-S2 difference and therefore no need for two equations. The corresponding equation is

\[ \frac{B_1}{B_2} = c\left(\frac{R_1 + \dfrac{R_2}{d_{br}}}{\dfrac{R_1}{d_{br}} + R_2}\right). \tag{18} \]

They described how this formulation successfully accounts for the data from studies that manipulated "extraneous" reinforcement, overall reinforcement rates, extreme ratios of reinforcement, and concurrent VI Ext schedules. However, they acknowledged that the model is not able to account for the effects of other choice-controlling variables such as reinforcer magnitude, delay, and quality, which are known to contribute to determination of behavior allocation in concurrent schedules. The theoretical problem is that when one of these dimensions is differentially arranged for two or more choices, the reinforcer "value" (Baum & Rachlin, 1969) is changed concurrently with its discriminability.
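Equation 18's central implication, that decreasing d_br flattens the matching function toward a zero slope while never producing overmatching, can be illustrated with a brief numerical sketch (hypothetical reinforcer ratios; again a very large value stands in for infinite discriminability):

    import math

    def eq18_ratio(r1, r2, d_br, c=1.0):
        """Equation 18: predicted B1/B2 in a two-response concurrent schedule."""
        return c * (r1 + r2 / d_br) / (r1 / d_br + r2)

    # Predicted log response ratios across reinforcer ratios, for several d_br.
    ratios = [(1, 9), (1, 3), (1, 1), (3, 1), (9, 1)]
    for d_br in [1.0, 2.0, 10.0, 1e6]:
        logs = [math.log10(eq18_ratio(r1, r2, d_br)) for r1, r2 in ratios]
        print(f"d_br={d_br:g}: log(B1/B2) = "
              + ", ".join(f"{v:+.2f}" for v in logs))

As d_br grows, the predicted log response ratios converge on the log reinforcer ratios (strict matching); as d_br approaches 1, they compress toward zero, the signature of undermatching.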
They offered two possibilities, but concluded, "a substantial research effort will be required to identify independent functions for the discriminability and value of the consequences of choice" (p. 472). Nevertheless, a prediction of their specific model, and of CDT generally, is that qualitatively different reinforcers differentially presented in a two-choice arrangement should add to the response-consequence contingency discriminability of the responses. This prediction arises from work on the differential outcomes effect.

The Differential Outcomes Effect

Trapold (1970) is credited with the first known study on what has come to be called the differential outcomes effect (DOE). He was investigating the inferred role of the s_g component of Spence's (1956) r_g-s_g mechanism. He reasoned that expectancies formed throughout the course of establishing the S-R association during instrumental conditioning could function as discriminative stimuli in much the same way as exteroceptive stimuli. He proposed that if qualitatively different reinforcers were presented in a discrete-trial conditional discrimination task, different expectations arising from the different reinforcers would each have their own differing stimulus properties. These stimulus properties should add to the stimulus complex and should therefore result in faster learning than if there were no differences between reinforcers (and thus discriminative stimuli). He exposed rats to a discrimination problem that required a response to one lever in the presence of a clicking sound and a response to another lever in the presence of a tone. He found an increased rate of acquisition and greater accuracy when correct responses to the clicking sound produced food and correct responses to the tone produced a sucrose solution than when both responses in the presence of either stimulus produced the same reinforcer.

The DOE is a robust phenomenon. Goeters, Blakely, and Poling (1992) reviewed and summarized the DOE literature to date and identified 38 studies designed specifically to test for it. They described how it had been demonstrated across a wide range of subjects including pigeons, rats, dogs, and humans, with a variety of consequences including food versus water, with different delays to reinforcement, and with both within- and between-subject experimental designs. However, all studies included in their review used discrete-trials methods, typically some variation of a matching-to-sample procedure. In fact, no DOE studies since have been found to use a free-operant procedure. Furthermore, they identified only 7 studies that had been done with human subjects, all of whom were classified as "mentally retarded" or autistic, and two of which were "methodologically weak" (Goeters et al., 1992, p. 398) clinical reports with only one subject. Furthermore, while Goeters et al. criticized the various studies' theoretical analyses because they continued to rely on expectancies, their proposed analysis is not much more convincing. Their reliance on private events, interoceptive stimuli, and proprioceptive stimuli seems equally unamenable to direct investigation. An alternative theoretical analysis was offered by Wixted (1989) and Alling, Nickel, and Poling (1991). They suggested that sample-specific responding could enhance discrimination simply by increasing the discriminability of the sample stimuli (i.e., increase d_sb).
This idea was quickly refuted by Urcuioli (1991), who clearly demonstrated retarded acquisition when particular outcomes always followed the same sample stimuli and unreliably followed the comparison stimuli, and facilitated acquisition when the same outcomes always followed the same sample and comparison stimuli. He concluded,

These results clearly demonstrate that differential outcomes do not affect conditional discrimination learning merely by enhancing the discriminability or distinctiveness of the sample with which they are associated. Rather, they apparently give rise to another discriminative cue (viz., an outcome expectancy), which can either enhance or interfere with performance, depending on its predictive validity. (p. 29)

It is unclear how confusable the outcomes may have been, but one outcome was food (delivered in the feeder) and the other was the feeder light being illuminated. If nothing else, the spatial proximity of the two outcomes could have contributed to contingency discriminability (d_br) and could thus have functioned as the other discriminative cue referred to by Urcuioli (1991). It could be possible that an analysis of differential outcomes in the context of CDT could lead "to predictions that cannot readily be distinguished from those of the elaborate conceptual machinery of" (Nevin, 1981, p. 14) expectations as discriminative stimuli.

The utility of the GMR in testing the DOE. Baum's (1974) GMR has been shown to be a generally reliable predictor of nonhuman operant behavior (deVilliers, 1977; but see Davison & McCarthy, 1988), has been shown in many cases to be suitable for describing human operant behavior (Pierce & Epling, 1983; but see Kollins, Newland, & Critchfield, 1997), and has been tested and expanded to consider more of the variables responsible for control of behavior (see Davison & McCarthy, 1988; Davison & Nevin, 1999). As such, it may be considered a useful tool with which to address some unanswered questions. A concurrent schedules arrangement allows qualitatively different consequences to compete directly for control of behavior, and the GMR's bias and sensitivity parameters (log c and a, respectively) provide a means of quantitatively describing and comparing the behavioral response to those different outcomes. Magoon and Critchfield (2005) took just this approach to investigate the relative effects of positive and negative reinforcement on human behavior. They were interested in whether a systematic bias would emerge if the two contingency types competed for behavior, but found unanticipated results. To understand the relevance of their findings to CDT, it will be informative first to understand previous efforts to compare positive reinforcement to negative reinforcement within a generalized matching paradigm.

The Matching Relation and Negative Reinforcement

Research with nonhumans. Catania (1966) was the first to write a comprehensive review of research on concurrent schedules. In that review, he discussed the research involving concurrent schedules of positive reinforcement and concurrent schedules of negative reinforcement. There was little at the time and none of it had been evaluated in terms of Equation 1. The primary concern of those previous studies was with dependent variables such as rate constancy and response independence (Catania, 1966). Another reason why Equation 1 was not used prior to Catania's writing was methodological in nature.
The programming of schedules of negative reinforcement up to that point was based on Sidman's (1953) shock delay procedure, in which shocks are presented at regular, experimenter-defined intervals and each response delays the next shock presentation for a period also specified by the experimenter. This procedure does not allow response rates to vary with changes in various dimensions of reinforcement (e.g., rate or magnitude) because of the temporal regularities embedded in the programming (deVilliers, 1972).

deVilliers (1972) was the first to respond to Hineline's (1970) suggestion that aversive conditioning procedures (i.e., negative reinforcement and punishment) should be studied in ways analogous to those used to study positive reinforcement. He viewed Herrnstein's (1970) revised equation as the vehicle that would allow such comparisons to be made. However, in order to do that, he needed to develop a procedure for programming schedules of negative reinforcement that would circumvent the problems associated with Sidman's (1953) method. Sidman (1966) was the first to propose using fixed-cycle schedules to study negative reinforcement. These schedules differ from Sidman's (1953) schedules in that high rates of responding do not produce ever-increasing delays to shock. In fixed-cycle procedures, the first response made in an interval deletes the next shock and subsequent responses have no further effect until a new cycle begins. deVilliers (1972) modified the procedure to eliminate the predictability associated with fixed intervals by varying the intervals according to Fleshler and Hoffman's (1962) progression. The resulting variable-cycle (VC) schedules of negative reinforcement share many of the same properties as VI schedules of positive reinforcement (Baron, 1991).

With his new procedure in hand, deVilliers (1972) was ready to examine the properties of negative reinforcement in much the same way that positive reinforcement had been researched. In his first experiment, he arranged single and multiple schedules of shock-maintained negative reinforcement for rats' bar pressing and found contrast in the multiple schedule components similar to what Bloomfield (1967) found with positive reinforcement. He then derived the free parameters in Herrnstein's (1970) equation for multiple schedules from the data using both received shock rate and shock-frequency reduction as the reinforcers. To test the validity of Herrnstein's equations, he then used the obtained parameters and the obtained reinforcement rates from the single schedule components to solve Herrnstein's equation for single schedule performance, again using both received shock rate and shock-frequency reduction as the reinforcers. The goal was to determine which conception of reinforcement was most appropriate for studying negative reinforcement. He found that Herrnstein's single response equation predicted responding better when shock-frequency reduction was used as the reinforcer maintaining responding.
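The mechanics of these schedules are compact enough to express in code. The following Python sketch is illustrative only (it is not the software used in any study discussed here); it uses one commonly cited form of the Fleshler-Hoffman (1962) progression together with the variable-cycle response rule:

    import math
    import random

    def fleshler_hoffman(mean_s, n):
        """One commonly cited form of the Fleshler-Hoffman (1962) progression:
        n intervals whose arithmetic mean is mean_s."""
        def nlogn(k):  # k * ln(k), with the 0 * ln(0) = 0 convention
            return k * math.log(k) if k > 0 else 0.0
        return [mean_s * (1 + math.log(n) + nlogn(n - i) - nlogn(n - i + 1))
                for i in range(1, n + 1)]

    def run_cycle(cycle_s, response_times):
        """Variable-cycle rule, after deVilliers (1972): the first response in
        a cycle cancels the consequence scheduled for the cycle's end; later
        responses have no further effect. Returns True if the aversive event
        should be delivered (i.e., no response occurred within the cycle)."""
        return not any(t < cycle_s for t in response_times)

    intervals = fleshler_hoffman(20.0, 10)
    random.shuffle(intervals)  # present cycles in an unpredictable order
    print(f"mean cycle = {sum(intervals) / len(intervals):.1f} s")  # 20.0 s
    print(run_cycle(intervals[0], response_times=[0.5]))  # False: loss avoided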
Although this finding was important to the one- versus two-factor theory-of-avoidance debate (e.g., Anger, 1963; Baum, 2001; Dinsmoor, 2001; Herrnstein, 1969; Herrnstein & Hineline, 1966; Mowrer, 1947; Rescorla & Solomon, 1967; Schoenfeld, 1950; Sidman, 1962), the more important observation for present purposes is that "the relationship between response rate and negative reinforcement in random-interval [read as VC] avoidance schedules is accurately described by the formulation of the law of effect developed by Herrnstein for positive reinforcement" (deVilliers, 1972, p. 506). This was the first attempt to examine negative reinforcement using procedures similar to those used to study positive reinforcement.

Although deVilliers (1972) validated the use of Herrnstein's equations to study negative reinforcement in the same way as positive reinforcement, it did little to compare the two contingencies. deVilliers (1974) ran a parametric study using the same procedures as before (deVilliers, 1972). In his first experiment, he exposed rats to an array of single VC schedules of negative reinforcement. He then fit the data to Herrnstein's (1970) equation for single schedule performance and found that, when shock-frequency reduction was used as the reinforcer, it described the performance accurately. Furthermore, he described his results as comparing favorably to those obtained by Catania and Reynolds (1968) for pigeons responding on VI schedules of food reinforcement. In his second experiment, he used multiple VC schedules of negative reinforcement and varied the duration of the components. He suggested that the component duration at which relative response rates were maximal would be the point at which interaction between schedules was maximal. If Herrnstein's interaction parameter is at maximum (i.e., equal to 1), that is the point where matching would be expected. He found that maximal relative response rates were obtained with brief component durations (40 s) and that it was at that point where matching was obtained using Herrnstein's (1970) equation for multiple schedules. These and other features of the data were compared to results obtained with positive reinforcement, and the general conclusion was that, if shock-frequency reduction is taken as the reinforcer for negative reinforcement, both can be integrated into the same conceptual framework. These were not direct comparisons, however. They were comparisons made across laboratories, subjects, reinforcers, and procedures. Other means of comparison needed to be made.

Baum and Rachlin (1969) investigated the possibility that relative rates of reinforcement matched relative time rather than relative response rates on concurrent schedules. To do so, they arranged an apparatus designed to record time spent in each of two simultaneously presented schedules while requiring nonspecific responses in those schedules. Food reinforcement was presented at variable times while the pigeon was in each component. They found that the pigeons matched the relative time spent in each schedule to the relative reinforcement obtained from each schedule. Baum (1973) replicated Baum and Rachlin's (1969) procedures exactly, except using timeout from shock as the reinforcer instead of food delivery. The physical appearance of the apparatus and the subjects used were the same as those used by Baum and Rachlin (1969). By holding as much as possible constant across experiments except for the type of reinforcer, some of the difficulties found with deVilliers's interpretations were avoided.
The data were graphed both independently and in combination with the data from Baum and Rachlin's (1969) study using positive reinforcement. Graphed independently, the relative time spent in each schedule matched the relative number of timeouts from shock. When viewed together, the data from both studies showed some marked similarities. First, the changeover patterns were similar, although the range of changeovers was considerably greater for negative reinforcement than it was for positive reinforcement. Second, across studies, the bias and sensitivity parameters were comparable for some subjects and averaged data were very similar. The general implications of the results are (a) that the reduction of shock plays the same role in negative reinforcement as does the rate of food presentation in positive reinforcement and (b) that the matching relation can serve as the vehicle by which direct comparisons can be made between the two contingency types.

Logue and deVilliers (1978) investigated whether behavior and time allocation maintained by simultaneously concurrent (as opposed to multiple or concurrent changeover-key procedures) schedules of negative reinforcement could also be described by Baum's (1974) equation. They arranged a standard operant chamber with two levers and exposed rats to various combinations of VC schedules of negative reinforcement using the same programming tactic as had been used by deVilliers (1972, 1974). After a careful and extensive shaping procedure, they found that the data best fit the matching equation when relative time allocation ratios were graphed as a function of the relative ratios of avoided shocks. The data were also described well by the relative ratios of responses. In addition, using responses and avoided shocks, the parameters from Baum's (1974) GMR equation were comparable to those found in concurrent schedules of positive reinforcement.

Nonhuman behavior maintained by single and multiple VC schedules of negative reinforcement had been shown to be described by Herrnstein's (1970) equations (deVilliers, 1972, 1974), time allocation maintained by concurrent VC schedules of negative reinforcement had been shown to be described by what was to become Baum's (1974) GMR equation (Baum, 1973), and behavior and time allocation maintained by concurrent VC schedules of negative reinforcement had been shown to be described by Baum's (1974) GMR equation (Logue & deVilliers, 1978). All of these studies paralleled the findings from methodologically similar studies with positive reinforcement. When the outcomes of these studies are considered together, the matching relation seems to provide an appropriate framework from which to evaluate the relative effects of positive and negative reinforcement. Up to this point, however, positive and negative reinforcement had only been compared via similar procedures. The most appropriate way to evaluate the relative effects of positive and negative reinforcement would be to pit them against one another directly in a concurrent arrangement and see how the data conform to the GMR.

Logue and deVilliers (1981) arranged equal concurrent VI schedules of milk reinforcement for rats. Once responding was established and stabilized, they introduced a third independent VC schedule of shock avoidance on one lever at a time. A modified version of the GMR was used to evaluate the relationship between reinforcement and response rates. This version assumed that the reinforcing effects of milk and shock avoidance would be additive.
In such a case, the ratio of response rates would be expected to match the ratio of reinforcement rates when either the numerator or the denominator of the reinforcement ratio consisted of the number of milk reinforcers delivered plus the number of shocks avoided and the other component of the ratio consisted of only the number of milk reinforcers delivered. One other parameter was included in this formulation of the GMR: a scaling constant had to be included to account for possible differences in the reinforcing effects of milk and avoidance of shock. The estimation of the scaling constant will be discussed below. When schedules of shock avoidance were overlaid on the schedules of milk reinforcement, both responses and time spent responding increased on that schedule while the number of food reinforcers received did not vary. Unfortunately, probably due to poor counterbalancing procedures, baselines were extremely variable both within and across subjects, complicating the analysis. Consequently, their dependent measures were not direct response and time measures, but were the percentage difference between the response and time measures of a shock condition and its preceding baseline condition. It is unclear how this data manipulation could have affected other data analyses. An even more troubling aspect of their data analysis is the method by which they calculated and selected the scaling constant. The authors fit nine constants ranging from .1 to 100 to the data with their equation and ostensibly used the constant that resulted in the highest correlation coefficients for time and responses. They concluded that the constant that fit the data best was 1.0 (r = .81, .71, .84, .86, respectively), meaning that one avoided shock was equal to one milk reinforcer; however, they reported seemingly better correlations with a scaling constant of 100 (r = .82, .75, .85, .84, respectively). Why they chose to conclude that 1.0 was the better scaling constant is not clear. Furthermore, that the correlations were so similar for scaling constants differing by two orders of magnitude calls into question the validity of the scaling procedure itself. Nevertheless, after fitting their data to their modified GMR equation using a scaling constant of 1.0, the authors concluded that the version of the GMR used in the experiment proved successful in describing response strength involving both food and shock avoidance as reinforcers. Although this experiment was the most direct comparison of positive and negative reinforcement up to that time, it has limited relevance because of the issues involved with deriving the scaling constant (see also Farley & Fantino, 1978). Properly evaluating the relative effects of positive and negative reinforcement (or any other qualitatively different consequences) in an operant context would require using quantitatively equal consequent stimuli programmed on identical schedules differing only by contingency type. This could be accomplished, of course, only with consequences that can be readily compared on a unit-by-unit basis. Human experimental procedures offer a distinct advantage in this regard because they typically employ conditioned reinforcers such as money as the basis for operant consequences (Pilgrim, 1998). Money gains and money losses are measured, quite literally, in the same currency, and thus are readily compared on a unit-by-unit basis.
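To make the additive formulation just described concrete before turning to human research, one plausible rendering (the exact published form may differ) for the lever on which avoidance was overlaid is

\[ \frac{B_1}{B_2} = c\left(\frac{R_{m1} + kA_1}{R_{m2}}\right)^{a}, \]

where R_m1 and R_m2 are the milk reinforcers obtained on each lever, A_1 is the number of shocks avoided on the lever carrying the added avoidance schedule, k is the scaling constant that equates one avoided shock with some number of milk reinforcers, and c and a are the GMR's familiar bias and sensitivity parameters.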
Research with humans. Research on concurrent schedules of positive versus negative reinforcement with human subjects in the context of any version of the matching relation is sparse. Only three studies were found that took this approach. Ruddle, Bradshaw, and Szabadi (1981) devised an apparatus composed of two panels with colored lights. One panel was the main panel, on which were three rows of five lights, a counter below the rows of lights, an additional green light to the left of the counter, a red light to the right of the counter, and a response button below the counter facing the subject that could be depressed by a force of 6 N (600 g). The second panel was smaller and had only three lights and a response button that could be depressed by a force of 2 N (200 g). The top row of lights on the main panel was orange, the middle row was blue, and the bottom row was white. On the smaller panel, the left light was orange, the middle one was blue, and the right one was white. Three subjects participated in a series of thirty 70-min sessions. Each session consisted of 10-min exposures to one of five different schedules with 5 min of rest between each exposure. The experiment was divided into three phases, ten sessions to each phase. The first phase involved negative reinforcement only, the second involved positive reinforcement only, and the third involved both. Negative reinforcement was programmed using VC schedules (deVilliers, 1972, 1974) and positive reinforcement was programmed using VI schedules. Points exchangeable for money were used as reinforcers (1 point = 1 pence) and were accumulated within sessions and across days. Thus, whatever value remained on the counter at the end of the daily sessions was added to cumulative daily earnings and a lump sum was paid at the conclusion of the study.

In the first phase of the experiment, the counter was programmed to begin at 200 at the start of each daily session. Subjects were given instructions indicating that when a white light was on they could occasionally lose points. A point loss would be signaled by the brief illumination of the red light and a decrease of the counter by one. Subjects were given nonspecific instructions about the response button. Each of the white lights was correlated with a different schedule of VC negative reinforcement. Except for the first (training) session, where the schedules were presented (and signaled) in descending order from the most dense (far left light) to the least dense (far right light), the various VC schedules were presented in a quasi-random sequence. Within sessions, each of five schedules was signaled by one of the five lights without replacement. Across sessions, no schedule could be associated with the same light it had been during the previous session.

In the second phase of the experiment, the counter was programmed to begin at zero at the start of each daily session. Subjects were given a new set of instructions indicating that when an orange light was on they could occasionally earn points by pressing the button on the main panel (the small panel had not been introduced yet). An earned point would be signaled by the brief illumination of the green light and an increase of the counter by one. The same procedure for presenting the schedules and their associated stimuli was used as in the previous phase.
In the third phase of the experiment, it is not stated specifically whether there were any preprogrammed supplemental earnings to parallel the conditions in phase one, or whether the counter began at zero as in phase two, but the subject instructions make it appear as if there were not. This phase introduced the smaller panel for the first time. Subjects were given instructions indicating that only the white lights would be operating on the main panel. However, pressing the button on the small panel would turn off the white lights on the main panel and turn on the orange light on the smaller panel. In effect, the button on the small panel functioned as a changeover key (Findley, 1958) and the button on the main panel was the response button associated with both the white and the orange lights. The same procedures for presenting the VC schedules of negative reinforcement were used in this phase as were used in phase one, with the addition of the concurrently available schedule of VI positive reinforcement signaled by the orange light. No COD was employed.

The data for the single schedule phases (i.e., phases 1 and 2) were analyzed using Herrnstein's (1970) equation for single schedules. The equation accounted for 69%, 95%, and 82% of the variance for each of the subjects under negative reinforcement conditions and 93%, 98%, and 59% of the variance under positive reinforcement conditions, respectively. The primary question, however, concerned how the concurrent performance data would fit Baum's (1974) GMR. When the logarithms of the ratio of responses (responses to negative/responses to positive) were plotted as a function of the logarithms of the ratio of reinforcers (point losses avoided/points earned), marked undermatching was found for all subjects. In one case, the slope of the regression line did not deviate significantly from zero. No bias was observed in any case. Time allocation data were consistent with response rate data.

Ruddle, Bradshaw, Szabadi, and Foster (1982) replicated Ruddle et al. (1981) with the goals of extending the findings and exploring the possibility that the lack of a COD in the previous experiment was the reason for the marked undermatching. The procedures were the same as Ruddle et al. (1981) except that the single schedules of positive and negative reinforcement were run merely as preliminary training sessions and were run for only one session each. The primary independent variable for this experiment was the length of the COD. Three subjects were run under conditions where there was a 5-s COD, two of those three subjects were rerun through all conditions using a 2-s COD, and one of the remaining subjects was rerun through all conditions without any COD.

With a 5-s COD, undermatching was found for one of the three subjects. The slopes of the regression lines for the other subjects did not deviate significantly from one. One of the three subjects showed a slight bias towards positive reinforcement while the other two subjects exhibited no bias. The time allocation data were not different from the response rate data. With a 2-s COD, the slopes of both subjects' matching functions decreased. One subject's data showed marked undermatching with a bias toward negative reinforcement. The other subject's data showed neither a bias nor a slope different from one. Again, the time allocation data were not different from the response rate data.
The response rate data from the only subject run during the no-COD conditions again showed a decrease in the slope, but it was not significantly different from one and showed no bias. There was also no bias demonstrated when considering time allocation data, but there was a slight tendency toward undermatching. These results suggest that positive and negative reinforcement contingencies are functionally symmetrical in the context of the matching relationship and that the avoidance of point loss exerts an effect on behavior equal to that of point gain (i.e., there was no orderly bias).

However, there is reason to be concerned with some aspects of the methods employed by Ruddle et al. (1981) and Ruddle et al. (1982). First, the schedules used for the positive and negative reinforcement conditions were not identical. VC schedules of negative reinforcement are programmed such that the first response in an interval cancels the delivery of the next programmed shock (or point loss) and subsequent responses have no effect until a new interval begins (deVilliers, 1972, 1974). In contrast, VI schedules of positive reinforcement are programmed such that responses within each interval have no effect until the end of an interval, when a reinforcer is made available. Only then, once an interval has timed out, does a response result in the delivery of a reinforcer. Although these schedules may bear notable similarities (Baron, 1991), it remains unclear whether this structural difference affects response patterns.

A second concern with Ruddle et al.'s (1981) and Ruddle et al.'s (1982) methods is that they did not incorporate the use of a baseline against which to measure relative differences between consequences. Under their procedures, such a comparison would be between concurrent schedules of VI positive versus VI positive reinforcement and concurrent schedules of VC negative versus VI positive reinforcement. Ruddle et al.'s results might only show that performance on concurrent schedules of VC negative versus VI positive reinforcement can be adequately described by the GMR, similarly to concurrent schedules of VC negative versus VC negative reinforcement and concurrent schedules of VI positive versus VI positive reinforcement. Their results cannot necessarily be interpreted to mean that the different consequences exert a similar effect on behavior without some sort of within-subject or between-groups comparison.

A concern specific to Ruddle et al.'s (1982) methods is their use of a COD. A COD used with concurrent schedules of positive versus positive reinforcement is an interval, initiated by a switch between response options, during which no responses are effective and thus no positive reinforcers are earned. Note that a COD is programmed either to pause schedule timers or to make responses ineffective at producing the consequence for a specified length of time. Both of these methods create unique difficulties with concurrent schedules of negative versus positive reinforcement. If schedule timers are paused during the COD, then a positive reinforcer cannot be delivered for responding on the manipulandum associated with the positive reinforcement schedule and a loss cannot occur for not responding on the manipulandum associated with the schedule of negative reinforcement. In this case, changing over from positive to negative reinforcement would functionally create a safety period, and could therefore promote changing over from positive to negative reinforcement.
The same COD discourages changing over to positive reinforcement (Brownstein & Pliskoff, 1968; Catania, 1966; deVilliers, 1977; Herrnstein, 1961; Shull & Pliskoff, 1967). Thus, this type of COD would be expected to promote "preference" for negative reinforcement. If the COD renders responses temporarily ineffective, then the possibility arises for responding on the negative reinforcement side to be punished adventitiously, thereby promoting a "preference" for the positive reinforcement contingency.

A fourth problem with Ruddle et al.'s (1981) and Ruddle et al.'s (1982) methods arises from the unique situation that resulted from their programming independent schedules using a changeover-key (Findley, 1958) arrangement with VC schedules of negative reinforcement. In phase 3, while a subject was working on the schedule of positive reinforcement, the light associated with the schedule of negative reinforcement was turned off, indicating the inability to affect that schedule. Since the schedules were programmed independently, the schedule of negative reinforcement continued to time and points could still have been lost with no clear indication as to whether the loss was due to a response associated with the light on the small panel, was response independent, or was due to the timing of the schedule of negative reinforcement. That is, the subject would not have been able to discriminate the source of the point loss. Using nonindependent schedules could have avoided this problem because no point loss would ever have been experienced while the subject worked on the positive reinforcement schedule, but this would have created its own problem. If schedules had been programmed nonindependently, points could not have been lost on the schedule of negative reinforcement while a subject worked on the schedule of positive reinforcement, and points could not have set up on the schedule of positive reinforcement while a subject worked on the schedule of negative reinforcement. Under such conditions, exclusive preference for the schedule of positive reinforcement would likely occur since, with this response pattern, points would only be gained and never lost.

A final criticism of the Ruddle et al. (1981) and Ruddle et al. (1982) methods is their inconsistent use of the points counter. It is unclear what effect having a visible counter may have on performance, especially if it is programmed differently across phases of the experiment. In Ruddle et al.'s (1981) phase one (single schedules of negative reinforcement), the counter started at 200, and, if the subject did not respond at all, 432 points would be lost per daily session, resulting in a balance of -232. In phase two (single schedules of positive reinforcement), the counter started at zero and, with perfect responding by the subject (i.e., no losses), 126 points would be earned per daily session, resulting in a balance of 126. Assuming the counter started at zero in phase three (each phase one schedule of negative reinforcement concurrently available with a single phase two schedule of positive reinforcement), if all reinforcers were earned on both schedules, the final daily balance would be 59 (no losses, 59 gained). If all losses were avoided and no positive reinforcers were earned, the final daily balance would be zero (no losses, no gains). If all positive reinforcers were earned and no losses were avoided, the final daily balance would be -373 (432 lost, 59 gained).
If no reinforcers were earned from either schedule, the final daily balance would be -432 (432 lost, no gains). Thus, the range of daily session earnings was different across phases (phase one: -232 to 200; phase two: 0 to 126; phase three: -432 to 59) and the counter started at different values across phases. These differences across phases are threats to the internal validity of the study to the extent that these stimulus differences might influence behavior (e.g., differential establishing operations; Michael, 1982, 1993, 2000).

Magoon and Critchfield (2005) devised a procedure following Madden and Perone (1999) to overcome the deficiencies found with Ruddle et al.'s (1981) and Ruddle et al.'s (1982) methods. Subjects faced a computer monitor displaying a software program that divided the screen into two equal left and right halves. Within each half was a target box that slowly moved in random directions within its half of the screen. The subject's task was to use an attached computer mouse to position the cursor over one or the other of the two target boxes and press the mouse button. Reinforcers (points exchangeable for money) could be lost or earned according to the programming of the active schedule. As a solution to the asymmetry of schedule structure found in the Ruddle studies, both schedules of negative and positive reinforcement were programmed following deVilliers's (1972, 1974) VC structure. In both cases, the first response in an interval influenced the end-of-interval consequence and subsequent responses within the interval had no programmed effect. With negative reinforcement, the first response in an interval cancelled the scheduled point loss at the end of the interval with no signal, and further responses in the interval had no effect. If no response occurred during the interval, point loss was signaled at the end of the interval by a flashing stimulus indicating the amount of money lost. With positive reinforcement, the first response in an interval earned the positive reinforcer scheduled for that interval, signaled by a flashing stimulus indicating the amount of money earned, and further responses in that interval had no effect. If no response occurred during the interval, there was no signal at the end of the interval and the next interval began timing.

Magoon and Critchfield (2005) also used a baseline matching function consisting of concurrent schedules of VC positive versus VC positive reinforcement to compare against the experimental matching function consisting of concurrent schedules of VC negative versus VC positive reinforcement. An identical range of relative reinforcement ratios was used across each phase (i.e., matching function) to ensure symmetry of frequency of reinforcement. The use of a baseline allowed for relative comparisons to be made between phases rather than to the presumed standard bias (log c = 0) and slope (a = 1) values of Baum's (1974) GMR. This way, Baum's GMR parameters could be used as metrics of comparison between phases rather than against assumed ideals. To overcome the COD difficulties posed by Ruddle et al.'s (1982) procedure, Magoon and Critchfield (2005) employed a compromise changeover method that combined the more traditional elements of concurrent schedule arrangements with a "Findley-type" changeover response (COR).
The apparatus itself more closely approximated those used in typical concurrent schedule studies of nonhumans because both response options were visible and operational simultaneously (i.e., both target boxes were constantly visible and their schedules operated continuously). This is in contrast to Ruddle et al.'s apparatus, which more closely approximated Findley's (1958) procedure (i.e., one response option, one changeover key, and different stimuli signaling different schedules). In addition to the physical differences in apparatus, Magoon and Critchfield (2005) used a COR button that was located between the two screen sides and was labeled "change." Five responses on the button executed a changeover. This is functionally equivalent to a COD because it takes time to execute the COR (Baum, 1982). Schedule timers did not pause and responses remained effective on the active target box, except that a target-box response prior to the completion of the five changeover responses reset the changeover response requirement. The only time schedule timers paused was during the flashing signals indicating point loss or point gain. This arrangement allowed point loss to occur during a changeover, but the subject was responding on the changeover button, not on the target box associated with the schedule of negative reinforcement, thereby preventing adventitious punishment of that operant. Furthermore, since schedules never stopped timing during a changeover, no safety periods existed that could promote changing over to negative reinforcement.

An important point regarding independence of schedules in the context of negative reinforcement is that the schedules must be highly discriminable from one another. To the degree that schedules are indiscriminable, they could be considered one larger, more complex operant (i.e., subjects would not be able to discriminate the source of point loss). The spatial separation of the target boxes and the COR went some way toward distinguishing the operants in Magoon and Critchfield's study, but discriminability was presumably increased by some additional stimuli. At the beginning of a session, both sides of the screen were white rectangles. The first response on either side of the screen changed the white rectangle on the other side to black, presented the words "Mouse On" below the white half of the screen, and presented the words "Mouse Off" below the black half of the screen. Both target boxes remained visible and active, but only the schedule with the white background and the words "Mouse On" associated with it could be influenced by a subject's responses. The completion of a COR changed the white background to black, the black background to white, the words "Mouse On" under the previously active side to "Mouse Off," and the words "Mouse Off" under the previously inactive side to "Mouse On." To differentiate the operants further, colors were randomly chosen from a pool of 16 and used as colors for the target boxes. The colors within a condition (i.e., each relative reinforcement ratio) were always different from each other and efforts were made to ensure equal use of all colors across conditions. Finally, when points were lost on a schedule of negative reinforcement, the target box associated with that schedule flashed red and indicated the amount of money lost, and when points were gained on a positive reinforcement schedule, the target box associated with that schedule flashed the amount of money gained.
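The changeover logic just described can be summarized in a few lines of code. The following Python sketch is illustrative only (names and structure are hypothetical; it is not the original program):

    # Sketch of the compromise changeover method: five presses on the
    # "change" button execute a changeover; any target-box response before
    # the fifth press resets the requirement. Schedule timers are assumed
    # to keep running throughout.

    COR_REQUIREMENT = 5

    class ChangeoverState:
        def __init__(self):
            self.active_side = "left"  # side whose schedule responses affect
            self.cor_count = 0         # presses accumulated toward a changeover

        def press_change_button(self):
            self.cor_count += 1
            if self.cor_count >= COR_REQUIREMENT:
                self.active_side = ("right" if self.active_side == "left"
                                    else "left")
                self.cor_count = 0     # changeover executed

        def press_target_box(self):
            self.cor_count = 0         # a target response resets the COR count
            # ...the response is then passed to the active side's schedule.

    state = ChangeoverState()
    for _ in range(5):
        state.press_change_button()
    print(state.active_side)  # -> "right"

The design rationale is that the response cost of five presses substitutes for a timed COD without ever pausing the schedules, so neither safety periods nor adventitious punishment can favor one contingency over the other.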
The spatial separation of the target boxes, the COR, the added stimuli, the differentially colored target boxes, and the explicit signals indicating which target box was the source of which consequence presumably maximized discriminability, and thus independence, between schedules.

Since it was unclear what effect the visible points counter might have on responding, Magoon and Critchfield (2005) did not use a visible counter during any part of the study. Instead, a message was presented on the screen at the end of each session informing the subject how much money had been earned in that session. Presession supplements were used for all negative reinforcement conditions and were set so that, if no responding occurred, earnings for that schedule would equal zero (i.e., a session could never end at a negative value).

Three subjects were exposed to the following VC schedules (relative reinforcement ratios in parentheses) for both the concurrent VC positive versus VC positive reinforcement baselines and the concurrent VC negative versus VC positive reinforcement experimental conditions, with appropriate counterbalancing: VC20:VC20 (1:1), VC15:VC30 (2:1), VC13:VC50 (4:1), VC12:VC72 (6:1), and VC11:VC100 (9:1). One subject withdrew from the study prematurely, leaving behind a data set incorporating fewer reinforcement ratio comparisons than that of the other two, but adequate to support planned analyses. No subject showed a significant bias in either condition (compared against the log c = 0 ideal of the GMR) and, more importantly, VC positive versus VC positive reinforcement matching function bias values were not different from VC negative versus VC positive reinforcement matching function bias values. This means that, in the context of these procedures, there is no systematic bias for either type of reinforcement. However, there was an orderly and pronounced sensitivity (i.e., slope) difference between phases. In every case, the VC negative versus VC positive reinforcement matching function was steeper than the VC positive versus VC positive reinforcement matching function. While these results indicate that neither reinforcement contingency worked better, was stronger, was more effective, generated more behavior, et cetera, they also indicate that positive and negative reinforcement are in some way qualitatively different. If valid, this outcome would correspond with the CDT prediction of a free-operant DOE when two qualitatively different outcomes compete for behavior.

Summary. Considering the evidence presented above, there is good reason to propose that positive and negative reinforcement are equally amenable to investigation through the procedural and analytical conventions associated with concurrent schedules. This outcome is demonstrated most clearly when the two types of consequences can be compared on some scale of measurement independent of their effects on behavior, as is the case when humans work under schedules of generalized conditioned reinforcement such as money. However, as evidenced by the matching function slope differences found by Magoon and Critchfield (2005), positive and negative reinforcement are not phenomenologically identical. The two types of contingencies yield different kinds of response-contingent stimulus changes, so they differ perceptually. That these two kinds of reinforcement are perceptually dissimilar makes them useful in more thoroughly testing the predictions of CDT.
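Because every comparison above, and each experiment to follow, reduces to estimating the GMR's two free parameters from condition-level data, a minimal sketch of that estimation step may be useful. The values below are hypothetical response and reinforcer totals, not data from any study reviewed here:

    # Ordinary least-squares fit of log(B1/B2) = a*log(R1/R2) + log c,
    # the standard GMR analysis, applied to hypothetical condition totals.
    import math

    # Each condition: ((responses B1, B2), (reinforcers R1, R2)).
    conditions = [((900, 100), (9, 1)), ((700, 300), (4, 1)),
                  ((500, 500), (1, 1)), ((300, 700), (1, 4)),
                  ((100, 900), (1, 9))]

    x = [math.log10(r1 / r2) for (_, (r1, r2)) in conditions]
    y = [math.log10(b1 / b2) for ((b1, b2), _) in conditions]

    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sum((xi - mx) ** 2 for xi in x)
    a = num / den              # sensitivity (slope)
    log_c = my - a * mx        # bias (intercept)
    print(f"sensitivity a = {a:.2f}, bias log c = {log_c:.2f}")

With these illustrative numbers the fit returns a sensitivity of about 0.89 and a bias of about zero, a typical undermatching outcome with no bias.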
Rationale for Current Studies

Magoon and Critchfield (2005) replication. Magoon and Critchfield's (2005) results demonstrated no bias differences but systematic sensitivity differences when performance under concurrent schedules of VC positive versus VC positive reinforcement was compared with performance under concurrent schedules of VC negative versus VC positive reinforcement within subject. It was noted earlier that the reasons for sensitivity differences are unclear. Davison and Nevin's (1999) model of CDT suggests that the slope differences could be due to d_sb or to d_br differences. Although they suggest that d_sb is not a factor in a simultaneously available two-response concurrent schedules arrangement, and care was taken to ensure that the two responses were as discriminable from one another as possible (note all of the stimuli used to indicate a separation of the target boxes), it remains possible that the differential use of target box colors could have somehow been differentially correlated with phase, thus incorporating a nonunit value of d_sb. It is questionable whether d_sb can ever be made infinitely large, but it is possible to hold it constant across conditions by using the same target box color for both response options regardless of the difference in reinforcement contingency. Accordingly, any difference between conditions would be a function only of response-consequence contingency differences (i.e., d_br).

Following the procedures of Ruddle et al. (1981) and Ruddle et al. (1982), Magoon and Critchfield (2005) programmed presession supplemental earnings in negative reinforcement phases (although no visible counter was used). The primary rationale for this was to avoid sessions where subjects would earn very little or no money or would lose money. One of the difficulties with studies of the matching relation with human subjects is the amount of time per session and the number of sessions required to complete the necessary conditions. If a great many of these resulted in low or negative earnings, subject retention could become difficult. Supplements were employed to avoid this eventuality. However, using supplements for negative reinforcement creates an asymmetry with positive reinforcement. As noted above, the supplements were programmed such that, in any given session, earnings could not go below zero. That is, just as in positive reinforcement conditions, if the subject emitted no responses during a session, the end-of-session message would indicate earnings of zero. The problem with this is that, although the number of reinforcers is equal to the equivalent reinforcement ratio in concurrent schedules of positive reinforcement, the negative reinforcers are functionally "double-valued." Not only does earning a negative reinforcer avoid the loss, but in terms of the final amount earned for the session, it also amounts to a gain equal to the gain of a positive reinforcer. This is most evident when comparing final session totals for comparable relative reinforcement rates for negative and positive reinforcement: session earnings were roughly equal. Since subject retention was the reason for the use of supplements, a solution to the problem would be to make a portion of subjects' hourly earnings contingent on attendance rather than strictly on performance and to begin every session with the counter set at zero. End-of-session totals would differ across reinforcement ratios, but all reinforcers would be quantitatively of equal magnitude.
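The "double-valued" character of supplemented negative reinforcers is easy to see in a toy calculation (hypothetical session: 10 scheduled consequences, each worth 1 point, 7 of them met):

    # Arithmetic sketch of the supplement asymmetry described above.
    earned, missed, unit = 7, 3, 1

    positive_total = earned * unit               # gains only: 7
    supplement = 10 * unit                       # presession supplement
    negative_total = supplement - missed * unit  # 10 - 3 = 7
    print(positive_total, negative_total)        # equal end-of-session totals

    # With no supplement and a counter starting at zero (the proposed
    # change), the same avoidance performance ends the session at -3 rather
    # than +7, so an avoided loss is no longer also worth a gain in totals.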
Magoon and Critchfield (2005) ran only three subjects. While the findings were consistent and orderly across all three, the finding needs intersubject replication (Sidman, 1960) to advance the generality of the findings. Consequently, a first experiment would be a systematic replication of Magoon and Critchfield (2005) using identically colored target boxes and no supplemental earnings. Successful replication of previous findings with these procedural differences would provide support for the reliability and validity of the slope-changing effects found when negative reinforcement competes with positive reinforcement in a concurrent schedules arrangement and would indicate that the slope differences were a function of only response-consequence contingency discriminability, as predicted by Equation 18.

Investigation of sensitivity differences. Baum's (1974) GMR offers a metric for comparing behavioral sensitivity between matching functions, but it does not unambiguously imply whether any operations (and if so, what operations) should affect the sensitivity parameter (a). CDT generally, and Davison and Nevin's (1999) model specifically, offers some insight. They predict that a differential outcome will increase the response-consequence contingency discriminability (d_br) between alternatives. Given that positive and negative reinforcement are both processes that increase (or maintain) ongoing operant behavior, and given that the proposed procedures presumably hold d_sb constant and use quantitatively equal outcomes, it is not unreasonable to suggest that sensitivity differences between concurrent schedules of VC positive versus VC positive reinforcement and concurrent schedules of VC negative versus VC positive reinforcement are simply a function of their differential outcomes.

The outcomes of positive and negative reinforcement differ in two ways. First, positive reinforcement involves gains and negative reinforcement involves losses. The important differential outcome between positive and negative reinforcement could simply be that behavior is more sensitive to losses than it is to gains. Second, there is a structural feedback asymmetry between positive and negative reinforcement. Consider a session where 10 reinforcers could be earned and the subject successfully earned 7 of them. If they were earned via positive reinforcement, the subject would be presented with 7 feedback stimuli over the course of the entire session (one for each reinforcer earned). If they were earned via negative reinforcement, the subject would be presented with 3 feedback stimuli over the course of the entire session (one for each reinforcer missed). The important differential outcome between positive and negative reinforcement could be this feedback asymmetry. By testing the necessity and sufficiency of both "loss vs. gain" and "feedback asymmetry," manipulating them across experiments and comparing each manipulation to an appropriate baseline condition within subject, the functional and fundamental difference between positive and negative reinforcement could be isolated. If one of these variables proves necessary and sufficient and the other does not, valuable information regarding the difference between positive and negative reinforcement would be gained, but a significant challenge would be presented both to CDT generally and to Davison and Nevin's (1999) model specifically (i.e., because a case would be presented where outcomes differed but there were no sensitivity differences). If neither "loss vs. gain"
nor "feedback asymmetry" proves both necessary and sufficient to increase sensitivity, and all that is required is any difference between outcomes, little would be added to knowledge regarding positive reinforcement vis-à-vis negative reinforcement, but CDT generally and Davison and Nevin's model specifically would receive further support.

References

Alsop, B. L. (1991). Behavioral models of signal detection and detection models of choice. In M. L. Commons, J. A. Nevin, & M. C. Davison (Eds.), Signal detection: Mechanisms, models, and applications (pp. 39-55). Hillsdale, NJ: Erlbaum.

Alling, K., Nickel, M., & Poling, A. (1991). The effects of differential and nondifferential outcomes on response rates and accuracy under a delayed-matching-to-sample procedure. The Psychological Record, 41, 537-549.

Anger, D. (1963). The role of temporal discriminations in the reinforcement of Sidman avoidance behavior. Journal of the Experimental Analysis of Behavior, 6, 477-506.

Baron, A. (1991). Avoidance and punishment. In I. H. Iversen & K. A. Lattal (Eds.), Experimental analysis of behavior: Part 1 (pp. 173-217). Amsterdam: Elsevier.

Baum, W. M. (1973). Time allocation and negative reinforcement. Journal of the Experimental Analysis of Behavior, 20, 313-322.

Baum, W. M. (1974). On two types of deviation from the matching law: Bias and undermatching. Journal of the Experimental Analysis of Behavior, 22, 231-242.

Baum, W. M. (1979). Matching, undermatching, and overmatching in studies of choice. Journal of the Experimental Analysis of Behavior, 32, 269-281.

Baum, W. M. (1982). Choice, changeover, and travel. Journal of the Experimental Analysis of Behavior, 38, 35-49.

Baum, W. M. (2001). Molar versus molecular as a paradigm clash. Journal of the Experimental Analysis of Behavior, 75, 338-341.

Baum, W. M., & Rachlin, H. C. (1969). Choice as time allocation. Journal of the Experimental Analysis of Behavior, 12, 861-874.

Bloomfield, T. M. (1967). Some temporal properties of behavioral contrast. Journal of the Experimental Analysis of Behavior, 10, 159-164.

Brownstein, A. J., & Pliskoff, S. S. (1968). Some effects of relative reinforcement rate and changeover delay in response-independent concurrent schedules of reinforcement. Journal of the Experimental Analysis of Behavior, 11, 683-688.

Catania, A. C. (1963a). Concurrent performances: Reinforcement interaction and response independence. Journal of the Experimental Analysis of Behavior, 6, 253-263.

Catania, A. C. (1963b). Concurrent performances: A baseline for the study of reinforcement magnitude. Journal of the Experimental Analysis of Behavior, 6, 299-300.

Catania, A. C. (1966). Concurrent operants. In W. K. Honig (Ed.), Operant behavior: Areas of research and application (pp. 213-270). New York: Appleton-Century-Crofts.

Catania, A. C., & Cutts, D. (1963). Experimental control of superstitious responding in humans. Journal of the Experimental Analysis of Behavior, 6, 203-208.

Catania, A. C., & Reynolds, G. S. (1968). A quantitative analysis of the responding maintained by interval schedules of reinforcement. Journal of the Experimental Analysis of Behavior, 11, 327-383.

Chung, S., & Herrnstein, R. J. (1967). Choice and delay of reinforcement. Journal of the Experimental Analysis of Behavior, 10, 67-74.

Davison, M. C. (1991). Stimulus discriminability, contingency discriminability, and complex stimulus control. In M. L. Commons, J. A. Nevin, & M. C. Davison (Eds.), Signal detection: Mechanisms, models, and applications (pp. 57-78).
Davison, M., & Jenkins, P. E. (1985). Stimulus discriminability, contingency discriminability, and schedule performance. Animal Learning & Behavior, 13, 77-84.
Davison, M., & McCarthy, D. (1980). Reinforcement for errors in a signal-detection procedure. Journal of the Experimental Analysis of Behavior, 34, 35-47.
Davison, M., & McCarthy, D. (1988). The matching law: A research review. Hillsdale, NJ: Erlbaum.
Davison, M., McCarthy, D., & Jensen, C. (1985). Component probability and component reinforcer rate as biasers of free-operant detection. Journal of the Experimental Analysis of Behavior, 44, 103-120.
Davison, M., & Nevin, J. A. (1999). Stimuli, reinforcers, and behavior: An integration. Journal of the Experimental Analysis of Behavior, 71, 439-482.
Davison, M. C., & Tustin, R. D. (1978). The relation between the generalized matching law and signal detection theory. Journal of the Experimental Analysis of Behavior, 29, 331-336.
deVilliers, P. A. (1972). Reinforcement and response rate interaction in multiple random-interval avoidance schedules. Journal of the Experimental Analysis of Behavior, 18, 499-507.
deVilliers, P. A. (1974). The law of effect and avoidance: A quantitative relationship between response rate and shock-frequency reduction. Journal of the Experimental Analysis of Behavior, 21, 223-235.
deVilliers, P. A. (1977). Choice in concurrent schedules and a quantitative formulation of the law of effect. In W. K. Honig & J. E. R. Staddon (Eds.), Handbook of operant behavior (pp. 233-287). Englewood Cliffs, NJ: Prentice Hall.
Dinsmoor, J. A. (2001). Stimuli inevitably generated by behavior that avoids electric shock are inherently reinforcing. Journal of the Experimental Analysis of Behavior, 75, 311-333.
Farley, J., & Fantino, E. (1978). The symmetrical law of effect and the matching relation in choice behavior. Journal of the Experimental Analysis of Behavior, 29, 37-60.
Ferster, C. B., & Skinner, B. F. (1957). Schedules of reinforcement. New York: Appleton-Century-Crofts.
Findley, J. D. (1958). Preference and switching under concurrent scheduling. Journal of the Experimental Analysis of Behavior, 1, 123-144.
Fleshler, M., & Hoffman, H. S. (1962). A progression for generating variable-interval schedules. Journal of the Experimental Analysis of Behavior, 5, 529-530.
Goeters, S., Blakely, E., & Poling, A. (1992). The differential outcomes effect. The Psychological Record, 42, 389-411.
Green, D. M., & Swets, J. A. (1966). Signal-detection theory and psychophysics. New York: Wiley.
Herrnstein, R. J. (1961). Relative and absolute strength of response as a function of frequency of reinforcement. Journal of the Experimental Analysis of Behavior, 4, 267-272.
Herrnstein, R. J. (1969). Method and theory in the study of avoidance. Psychological Review, 76(1), 49-69.
Herrnstein, R. J. (1970). On the law of effect. Journal of the Experimental Analysis of Behavior, 13, 243-266.
Herrnstein, R. J., & Hineline, P. N. (1966). Negative reinforcement as shock-frequency reduction. Journal of the Experimental Analysis of Behavior, 9, 421-430.
Hineline, P. N. (1970). Negative reinforcement without shock reduction. Journal of the Experimental Analysis of Behavior, 14, 259-268.
Kollins, S. H., Newland, M. C., & Critchfield, T. S. (1997). Human sensitivity to reinforcement in operant choice: How much do consequences matter? Psychonomic Bulletin & Review, 4, 208-220.
Logue, A. W. (1983). Signal detection and matching: Analyzing choice on concurrent variable-interval schedules. Journal of the Experimental Analysis of Behavior, 39, 107-127.
Logue, A. W., & deVilliers, P. A. (1978). Matching in concurrent variable-interval avoidance schedules. Journal of the Experimental Analysis of Behavior, 29, 61-66.
Logue, A. W., & deVilliers, P. A. (1981). Matching of behavior maintained by concurrent shock avoidance and food reinforcement. Behaviour Analysis Letters, 1, 247-258.
Madden, G. J., & Perone, M. (1999). Human sensitivity to concurrent schedules of reinforcement: Effects of observing schedule-correlated stimuli. Journal of the Experimental Analysis of Behavior, 71, 303-318.
Magoon, M. A., & Critchfield, T. S. (2005). Concurrent schedules of positive and negative reinforcement: Differential-impact and differential-outcomes effects. Manuscript submitted for publication.
McCarthy, D., & Davison, M. (1979). Signal probability, reinforcement and signal detection. Journal of the Experimental Analysis of Behavior, 32, 373-386.
McCarthy, D., & Davison, M. (1980). Independence of sensitivity to relative reinforcement rate and discriminability in signal detection. Journal of the Experimental Analysis of Behavior, 34, 273-284.
McCarthy, D., Davison, M., & Jenkins, P. E. (1982). Stimulus discriminability in free-operant and discrete-trial detection procedures. Journal of the Experimental Analysis of Behavior, 37, 199-215.
Michael, J. (1982). Distinguishing between discriminative and motivational functions of stimuli. Journal of the Experimental Analysis of Behavior, 37, 149-155.
Michael, J. (1993). Establishing operations. The Behavior Analyst, 16, 191-206.
Michael, J. (2000). Implications and refinements of the establishing operation concept. Journal of Applied Behavior Analysis, 33, 401-410.
Miller, J. T., Saunders, S. S., & Bourland, G. (1980). The role of stimulus disparity in concurrently available reinforcement schedules. Animal Learning & Behavior, 8, 635-641.
Mowrer, O. H. (1947). On the dual nature of learning-a re-interpretation of "conditioning" and "problem solving." Harvard Educational Review, 17, 102-148.
Neuringer, A. J. (1967). Effects of reinforcement magnitude on choice and rate of responding. Journal of the Experimental Analysis of Behavior, 10, 417-424.
Nevin, J. A. (1969). Signal detection theory and operant behavior: A review of David M. Green and John A. Swets' "Signal-detection theory and psychophysics." Journal of the Experimental Analysis of Behavior, 12, 475-480.
Nevin, J. A. (1981). Psychophysics and reinforcement schedules: An integration. In M. L. Commons & J. A. Nevin (Eds.), Quantitative analyses of behavior: Vol. 1. Discriminative properties of reinforcement schedules (pp. 3-27). Cambridge, MA: Ballinger.
Nevin, J. A., Jenkins, P., Whittaker, S., & Yarensky, P. (1982). Reinforcement contingencies and signal detection. Journal of the Experimental Analysis of Behavior, 37, 65-79.
Pierce, W. D., & Epling, W. F. (1983). Choice, matching, and human behavior: A review of the literature. The Behavior Analyst, 6, 57-76.
Pilgrim, C. (1998). The human subject. In K. A. Lattal & M. Perone (Eds.), Handbook of research methods in human operant behavior (pp. 15-44). New York: Plenum.
Rachlin, H. C. (1971). On the tautology of the matching law. Journal of the Experimental Analysis of Behavior, 15, 249-251.
Rescorla, R. A., & Solomon, R. L. (1967). Two-process learning theory: Relationships between Pavlovian conditioning and instrumental learning. Psychological Review, 74(3), 151-182.
Ruddle, H. V., Bradshaw, C. M., & Szabadi, E. (1981). Performance of humans in variable-interval avoidance schedules programmed singly, and concurrently with variable-interval schedules of positive reinforcement. Quarterly Journal of Experimental Psychology, 33, 213-226.
Ruddle, H. V., Bradshaw, C. M., Szabadi, E., & Foster, T. M. (1982). Performance of humans in concurrent avoidance/positive-reinforcement schedules. Journal of the Experimental Analysis of Behavior, 38, 51-61.
Schoenfeld, W. N. (1950). An experimental approach to anxiety, escape and avoidance behavior. In P. H. Hoch & J. Zubin (Eds.), Anxiety (pp. 70-99). New York: Grune & Stratton.
Shull, R. L., & Pliskoff, S. S. (1967). Changeover delay and concurrent schedules: Some effects on relative performance measures. Journal of the Experimental Analysis of Behavior, 10, 517-527.
Sidman, M. (1953). Avoidance conditioning with brief shock and no exteroceptive warning signal. Science, 118, 157-158.
Sidman, M. (1960). Tactics of scientific research: Evaluating experimental data in psychology. New York: Basic Books.
Sidman, M. (1962). Reduction of shock frequency as reinforcement for avoidance behavior. Journal of the Experimental Analysis of Behavior, 5, 247-257.
Sidman, M. (1966). Avoidance behavior. In W. K. Honig (Ed.), Operant behavior: Areas of research and application (pp. 448-498). New York: Appleton-Century-Crofts.
Spence, K. W. (1956). Behavior theory and conditioning. New Haven: Yale University Press.
Thorndike, E. L. (1911). Animal intelligence: Experimental studies. New York: Macmillan.
Trapold, M. A. (1970). Are expectancies based upon different positive reinforcing events discriminably different? Learning and Motivation, 1, 129-140.
Urcuioli, P. J. (1991). Retardation and facilitation of matching acquisition by differential outcomes. Animal Learning & Behavior, 19, 29-36.
White, K. G., Pipe, M-E., & McLean, A. P. (1984). Stimulus and reinforcer relativity in multiple schedules: Local and dimensional effects on sensitivity to reinforcement. Journal of the Experimental Analysis of Behavior, 41, 69-81.
Wixted, J. T. (1989). Nonhuman short-term memory: A quantitative reanalysis of selected findings. Journal of the Experimental Analysis of Behavior, 52, 409-426.

Footnote

1 The term "negative reinforcement" is typically defined as the increase or maintenance of responding by either the removal or avoidance of some stimulus. By this definition, the term can refer to two procedures: one wherein an ongoing stimulus is terminated contingent on the response (escape), and a second wherein a stimulus that has previously been presented in the absence of responding is avoided contingent on the response (avoidance). The more general term "negative reinforcement" will be used throughout the paper to refer to the specific case of avoidance rather than escape.

Figure Caption

Figure 1. The four possible outcomes from a signal detection trial. S1 is the presentation of the signal-plus-noise stimulus and S2 is the noise-only stimulus. B1 is a "signal present" response and B2 is a "signal absent" response. Hits (R11) and Correct Rejections (R22) are typically followed by reinforcement, and Misses (R12) and False Alarms (R21) are typically followed by punishment or extinction.

[Figure 1: a 2 x 2 matrix crossing stimulus (S1, S2) with response (B1, B2): S1-B1 = Hit (R11); S1-B2 = Miss (R12); S2-B1 = False Alarm (R21); S2-B2 = Correct Rejection (R22).]
CHAPTER 2

RESPONSE-CONSEQUENCE CONTINGENCY DISCRIMINABILITY WHEN POSITIVE AND NEGATIVE REINFORCEMENT COMPETE IN CONCURRENT SCHEDULES

Most operant research can be characterized as belonging to one of two broad, and typically mutually exclusive, categories: consequence control and stimulus control. Studies of consequence control tend to hold antecedent stimuli constant and vary one or more consequence dimensions (e.g., frequency, magnitude, type, delay). Studies of stimulus control tend to hold maximally different consequence conditions constant (e.g., continuous reinforcement versus extinction) and vary antecedent stimuli. The formulation of a unified theory capable of accounting for the research results from both domains would be a landmark achievement.

The generalized matching relation (GMR) (Baum, 1974) is a quantitative expression of a general theory of choice that has proven successful in characterizing the results from a variety of studies of consequence control. Its generality has been demonstrated across species, procedures, and behaviors and has been extended to describe the effects of a range of independent variables (see Davison & McCarthy, 1988, for a review). In its most general form for a two-ply concurrent-schedules arrangement, the equation

\log\left(\frac{B_1}{B_2}\right) = a_x \log\left(\frac{X_1}{X_2}\right) + \log c \qquad (1)

describes the relationship between the distribution of responses (B) and the distribution of the different dimensions of reinforcement (X) (e.g., frequency, magnitude, type, delay) across the two measured response options (1 and 2). Rarely is a perfect one-to-one relationship (i.e., strict matching; Herrnstein, 1961) observed, and the parameters a_x and log c describe deviations from this ideal. The a_x parameter is a measure of behavioral sensitivity to changing distributions of the independent variable (X) (Baum, 1979) and has been described as measuring "the degree of control that those variables exert over preference" (Landon, Davison, & Elliffe, 2003). The log c parameter is a measure of response bias and reflects a constant preference for responding to one option over the other regardless of the ratio(s) of the independent variable(s) (Baum, 1974).

Most concurrent-schedules research has focused on varying the relative frequency of reinforcement. Consequently, the most familiar form of Equation 1 is

\log\left(\frac{B_1}{B_2}\right) = a_r \log\left(\frac{R_1}{R_2}\right) + \log c \qquad (2)

where B_1 and B_2 represent the number of responses to each option, R_1 and R_2 represent the number of reinforcements obtained from responding to each option, a_r measures behavioral sensitivity to changing ratios of reinforcement frequency, and log c measures constant response bias. Data obtained from concurrent-schedules research typically are evaluated by plotting logarithms of response ratios as a function of the logarithms of reinforcement ratios, and then employing least-squares linear regression to obtain estimates of a_r and log c (a minimal computational sketch follows below). Behavioral sensitivity is reflected as the estimate of a_r, which is the slope of the regression line. Strict matching is revealed by a slope of one, but the distribution of empirical slope values across studies tends to center around .8 (Baum, 1979; Kollins, Newland, & Critchfield, 1997; Robinson, 1992; Wearden & Burgess, 1982).
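To make the fitting procedure concrete, the following sketch fits Equation 2 to a small set of hypothetical condition totals. The data values, and the use of Python/NumPy, are illustrative assumptions only; they are not the data or the analysis software of the present investigation.

import numpy as np

# Hypothetical totals from the last four (stable) sessions of four
# conditions: responses (B) and obtained reinforcers (R) for each of
# two concurrently available alternatives.
B1 = np.array([240.0, 180.0, 100.0, 70.0])
B2 = np.array([100.0, 120.0, 150.0, 168.0])
R1 = np.array([48.0, 40.0, 24.0, 16.0])
R2 = np.array([16.0, 24.0, 40.0, 48.0])

# Equation 2: log(B1/B2) = a_r * log(R1/R2) + log c
x = np.log10(R1 / R2)
y = np.log10(B1 / B2)

# Least-squares fit: the slope estimates a_r (sensitivity) and the
# intercept estimates log c (bias).
a_r, log_c = np.polyfit(x, y, 1)

# Proportion of variance accounted for (r squared).
y_hat = a_r * x + log_c
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

print(f"a_r = {a_r:.3f}, log c = {log_c:.3f}, r^2 = {r2:.3f}")
# These hypothetical values yield a_r of about .80, i.e., the modal
# undermatching outcome described in the text, with log c near zero.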
Constant response bias is reflected as a nonzero estimate of log c and a shift of the regression line up or down on the ordinate, depending on which option is favored.

Operant research on stimulus control has not provided a comparably effective quantitative model. Davison (1991) argued that this is why stimulus control research has declined in popularity in recent decades. However, at about the same time the matching relation (Herrnstein, 1961) was being developed, classical signal detection theory (SDT) (Green & Swets, 1966) produced a relatively successful quantitative model that was designed to separate stimulus effects from other effects (e.g., consequences). In a typical signal detection task, a signal (i.e., stimulus) is either presented with noise (i.e., some other stimuli) or the noise is presented without the signal in discrete trials. This allows four possible outcomes on each trial. When the signal-with-noise is presented, the subject can report (verbally with humans, with some other operant response with nonhumans) the signal's presence (i.e., a hit) or its absence (i.e., a miss). When the noise-without-signal is presented, the subject can also report the signal's presence (i.e., a false alarm) or its absence (i.e., a correct rejection). Hits and correct rejections are followed by reinforcement, and misses and false alarms are followed by extinction or punishment. Plotting the probability of hits as a function of the probability of false alarms reveals orderly relationships (see Green & Swets, 1966, for elaboration).

"The most striking achievement of the theory is that if signal strength is constant, [the measure of sensory sensitivity to the difference between stimuli] remains the same within variations of any given experiment ... More importantly, it remains constant across different classes of experiment" (Nevin, 1969, p. 477). However, SDT provided elaborate conceptual machinery to explain biasing effects (e.g., effects from consequences) that Nevin (1981) thought might better be explained by incorporating the matching relation. His work began an effort to develop an integrated quantitative model of behavioral detection according to what was later to become contingency discriminability theory (CDT) (Alsop, 1991; Davison & Jenkins, 1985; Davison & Nevin, 1999; Davison & Tustin, 1978; Nevin, 1981, 2005).

To summarize CDT, it is first necessary to recount the features of an operant choice situation in which two or more responses share a common stimulus context. In such a situation, responses are mutually exclusive, meaning that they cannot be performed simultaneously, and therefore time and effort devoted to one response precludes the occurrence of others. Aside from this competition, however, concurrent responses have traditionally been viewed as essentially independent, in the sense that each is maintained by a different reinforcement contingency (Baum, 1974; Herrnstein, 1970; Pierce & Epling, 1983; Redmon & Lockwood, 1986). Thus, reinforcer R1 strengthens response B1 but not response B2, and reinforcer R2 strengthens response B2 but not response B1.

CDT challenges this assumption of independence, instead viewing the response-consequence contingency as similar to the stimulus-to-be-reported in a signal detection task. SDT views a stimulus report as controlled partly by the stimulus and partly by the noise.
The response on a given trial is influenced both by the similarity of the stimuli and by the reporter's imperfect perceptual/attentive abilities to discriminate among them. Similarly, CDT views a given response as controlled partly by the consequence that is contingent upon it and partly by the "noise" of other simultaneously available response-consequence contingencies. The relative strength of a response is influenced both by its contingent consequence and by other response-consequence contingencies to the extent that the organism's imperfect "response-consequence contingency detection" abilities allow it to discriminate among them. According to Davison and Nevin (1999), reinforcement contingencies are "confusable" to the extent that some factor diminishes the "distinctiveness of the relation between behavior and reinforcement for one discriminated operant relative to another" (Davison & Nevin, 1999, p. 445).

In the absence of confusability, behavior allocation will exactly match relative reinforcement frequency, yielding a GMR slope of 1. Theoretically, all dimensions of reinforcement (e.g., frequency, magnitude, type, delay) contribute to confusability, so if they are all identical across competing responses, the responses will be undifferentiated and will occur at identical rates. In the typical operant choice experiment, reinforcement frequency is varied across competing behaviors while all other dimensions are held constant (i.e., are completely confusable). From the perspective of CDT, then, it is not surprising that the modal outcome of these experiments is pronounced undermatching, or slope < 1.

It is axiomatic to CDT that anything that increases response-consequence discriminability should increase response differentiation in operant choice. Although CDT has proven to have great heuristic value (it has suggested new research questions and allowed the integration of a number of seemingly disparate findings; Davison & Nevin, 1999), this central precept has been inadequately tested. For example, CDT clearly predicts that the slope of the GMR should be steeper when two concurrent responses are maintained by qualitatively different reinforcers than when they are maintained by qualitatively identical reinforcers. Although several experiments have employed the former, most have lacked the latter as a control condition (e.g., Farley & Fantino, 1978; Hollard & Davison, 1971; Logue & deVilliers, 1981; Miller, 1976; Ruddle, Bradshaw, Szabadi, & Foster, 1982). Apparently, the increased behavioral sensitivity that is predicted by CDT when concurrently available reinforcers are qualitatively different has been tested in only three experiments: one with possums as subjects (Bron, Foster, Sumpter, & Temple, 2003) and two with humans as subjects (Critchfield & Lane, 2005; Magoon & Critchfield, 2005).

Bron et al. (2003) assessed the GMR three times within subject. In "baseline" phases, both responses led to a 3-s presentation of a magazine containing barley. In the two "experimental" phases, the barley in one magazine had been soaked in a salt-water solution (either 4% or 6%). Analyses focusing on response allocation showed a steeper-than-baseline slope for 2 of 4 subjects who completed the 4% phase, and for 4 of 5 subjects who completed the 6% phase. It should be noted that where slope increases were observed, the differences ranged from .01 to .11. These differences are small enough to warrant concern over their functional importance.
Additionally, so little is known about operant behavior in possums and the procedures used to study it that these findings support CDT provisionally at best.

Magoon and Critchfield (2005) assessed the GMR twice within subject for three subjects. In "baseline" phases, both responses were maintained through positive reinforcement. In "experimental" phases, one response was maintained through positive reinforcement and the other was maintained through negative reinforcement.¹ Slopes were steeper in the latter case for all three subjects. Critchfield and Lane (2005) replicated the effect with an additional three subjects with only minor procedural modifications.

Before the results from Magoon and Critchfield (2005) and Critchfield and Lane (2005) can be interpreted as supporting CDT, two procedural ambiguities must be resolved. First, responses consisted of mouse clicks directed at either of two moving target boxes on a computer screen, and in each experimental condition, the target boxes were different colors that varied unsystematically from condition to condition. According to CDT, in addition to the response-consequence contingency discriminability described above, the distinctiveness of the stimulus-response contingencies and of the responses themselves contributes to functional reinforcer allocation and thus to steady-state responding. It is possible, therefore, that varying the target box color pairs across experimental conditions, rather than the response-consequence contingency differences, was responsible for the slope-increasing effects observed. Second, in the procedure employed by Magoon and Critchfield (2005), subject earnings were supplemented at the beginning of each session, so that money losses encountered through negative reinforcement would not result in end-of-session totals below zero. This practice, initiated in an attempt to keep subjects from withdrawing from the experiment when earnings were low, may have created a different economic context when negative reinforcement was programmed than when it was not. This, too, rather than the response-consequence contingency differences, could have contributed to the slope-increasing effects observed. It is interesting to note that when Critchfield and Lane (2005) omitted the money supplements, slope effects were less pronounced than in the Magoon and Critchfield (2005) experiment.

The purpose of the present investigation was to replicate and extend the effects reported by Magoon and Critchfield (2005) and Critchfield and Lane (2005). Experiment 1 employed the same procedures as those two experiments except that (a) the targets used to define responses were identically colored within and across experimental conditions, and (b) no money supplements were used. Following replication of the slope effect, Experiments 2 and 3 were an attempt to determine what particular feature(s) of negative reinforcement makes it qualitatively different from positive reinforcement as expressed by the slope-increasing effect predicted by CDT. Experiment 4 was a control experiment that compared the slopes of two matching functions with the same within-phase consequences, but with different between-phase consequences.

GENERAL METHOD

Subjects

Participants. Subjects volunteered in response to notices posted on a recruitment board and were selected for participation based only on scheduling convenience. Fourteen individuals were accepted into the investigation; two dropped out prematurely without giving reasons.
Money earnings and payment. Subjects were paid $2.00 per hour of participation, in cash, immediately following each visit to the laboratory. Additional monies were earned within experimental sessions. Reinforcement contingencies were programmed to result in aggregate earnings of approximately $4.00 per hour. Thus, combining attendance pay and session earnings, subjects earned about $6.00 per hour of participation. Payment of session earnings was delivered following the conclusion of each individual's participation.

Setting and Apparatus

Experimental sessions were conducted in a 3 m x 3 m room with stereo speakers affixed to the ceiling. Two large study carrels, each with a chair, a color computer monitor, a computer mouse, and a small fan, served as subject workspaces. The carrels were separated by curtain dividers, which kept subjects from observing one another if more than one subject was running at a time. Each subject had the choice of having the fan on and listening to soft instrumental music from the speakers, having the fan on and listening to their own music from the speakers, or listening to their own personal audio playback device with headphones (e.g., an MP3 player). IBM-compatible computers in an adjacent room controlled experimental events and collected the data according to a custom program created using QuickBasic.

Procedure

Subjects visited the laboratory four or five days per week for 2 hours per day. Experimental sessions lasted 8 min and were separated by short breaks of about 1 to 5 min during which the experimenter recorded the data and initiated the next session.

Instructions. Appendix A describes the intake and training procedures and the instructions used for each experiment.

Experimental task. The concurrent-schedules procedure was nearly identical to that of Magoon and Critchfield (2005), with slight modifications for each experiment. Sessions began with the display of the message "Click here to begin" centered on the screen and directly above a rectangular box with the word "Ready" inside of it. An arrow-shaped cursor indicated the virtual position of the mouse at all times during the session. Positioning the cursor on the box and pressing the left mouse button cleared the prompts and produced two white rectangles, or work areas, each approximately 13 cm wide by 17 cm tall, arranged one on each side of the screen, both against a background the color of which varied with experimental condition (see Tables 1-4). Within each of the two work areas was a small (approximately 1.6 square cm) red target box. Throughout each session, the target boxes moved in random directions at about 1.25 cm per s. Pressing the left mouse button with the cursor positioned on either of the two target boxes constituted the two measured responses. Pressing the left mouse button while the cursor was positioned elsewhere was ineffective (unless it was a changeover response, see below) and was not counted. The side on which a subject's first response occurred remained white and the message "Mouse On" appeared just below it. The other work area turned black and the message "Mouse Off" appeared below it. Both target boxes remained visible and reinforcement schedules associated with each one continued to operate. Responses on the "Mouse On" side were recorded and influenced the schedule of reinforcement programmed for that side, while responses on the "Mouse Off" side were not recorded and had no programmed effect.
Located between the two work areas was a 2.5 cm square changeover (CO) area labeled with the word "Change." Positioning the cursor on the CO area and pressing the left mouse button five consecutive times reversed the status of the two work areas. A response on the active target prior to completing five consecutive changeover responses reset the CO counter to zero. Session and reinforcement-schedule timers continued to operate during CO responding. At the end of each session, the screen cleared and a message was presented indicating that the session was over and displaying the earnings for that session.

Schedules. Responding on the moving target boxes influenced money gains or losses (depending on the schedule in effect) according to independent, concurrent variable-cycle (VC) schedules using constant-probability distributions of intervals (Fleshler & Hoffman, 1962; a sketch of this interval progression appears at the end of this section). VC schedules of reinforcement (deVilliers, 1972) are programmed to deliver one consequence per interval, contingent on the first response in that interval. Subsequent responses in the interval are counted but have no further influence on the programmed schedule. If no response is made in a given interval, the programmed consequence does not occur. Under all conditions, the programmed overall rate of reinforcement was 360 reinforcers per hour of session time aggregated across the two work areas. Within experimental phases, 2:1 and 5:1 relative ratios of reinforcement were programmed via concurrent VC15:VC30 and VC12:VC60 schedules of reinforcement, respectively. Across conditions, attempts were made to avoid consistent relationships between the work area (i.e., left vs. right), reinforcement frequency (i.e., rich vs. lean schedule), and reinforcement type (e.g., "standard" positive vs. "standard" negative). Tables 1-4 show the specific conditions experienced by each subject in each experiment and the sequence in which they were completed.

Consequences. Sometimes schedules were programmed such that responses resulted in an accrual of money and failing to respond resulted in a lost opportunity to accrue money. Sometimes schedules were programmed such that money was deducted for failing to respond and responses resulted in canceling the deduction of money. The manner in which these consequent events were signaled varied across experiments and will be described separately for each. The magnitude of each accrual and deduction was established separately for each experiment so that earnings per hour remained relatively constant for all subjects across all experiments.² The magnitude of consequent events is indicated separately for each experiment. Cumulative earnings within a session were not displayed on the subject's screen, although the experimenter kept a running total of these earnings. Subjects were free to ask how much they had earned up to that point at any time either before or after, but not during, a daily visit. Subjects rarely asked.

Stability criteria. A condition was terminated when the following criteria were met: (a) for both response-allocation and time-allocation proportions, over 4 consecutive sessions, the means of the first and second pairs of sessions differed by no more than 10% of the 4-session mean; and (b) visual analysis revealed no systematic trend in the data. (A sketch of the quantitative portion of this criterion also appears below.)
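Two pieces of the machinery just described lend themselves to brief illustration. First, the constant-probability interval progression of Fleshler and Hoffman (1962) can be generated as follows. This is a sketch only, not the QuickBasic control program actually used; the mean and interval count below are chosen merely for illustration. The progression takes the ith of n equal-probability intervals from an exponential distribution with the given mean, reading 0 * ln(0) as 0.

import math

def fleshler_hoffman(mean_s, n):
    # The ith of n equal-probability intervals (in s) for a
    # constant-probability VI/VC schedule with the given mean
    # (Fleshler & Hoffman, 1962).
    def term(k):
        return k * math.log(k) if k > 0 else 0.0  # 0 * ln(0) -> 0
    return [mean_s * (1 + math.log(n) + term(n - i) - term(n - i + 1))
            for i in range(1, n + 1)]

# Example: a 15-s mean (cf. the VC15 component). In practice the
# resulting intervals would be sampled without replacement over a session,
# and their arithmetic mean equals the programmed schedule value.
intervals = fleshler_hoffman(15.0, 12)

Second, the quantitative portion of the stability criterion reduces to a simple check over the last four sessions. The visual-trend portion of the criterion is not captured here, and the proportions in the example are hypothetical.

def stable(last4, tolerance=0.10):
    # True if the means of the first and second pairs of the last four
    # sessions differ by no more than 10% of the 4-session mean.
    first, second = sum(last4[:2]) / 2, sum(last4[2:]) / 2
    return abs(first - second) <= tolerance * (sum(last4) / 4)

print(stable([0.62, 0.64, 0.63, 0.61]))  # True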
Data Analysis

The present investigation was concerned primarily with understanding the conditions affecting increased behavioral sensitivity. Commensurate with this emphasis, the focus of the data analysis was on the slopes of the linear functions (and their within-subject, between-phase differences) created by using least-squares linear regression to fit Equation 2 to individual data (Baum, 1974, 1979). The intercepts of these functions (a measure of bias) are not central to the research question, but will be described for each experiment out of convention.

Experiment 1

Magoon and Critchfield (2005) examined human matching under concurrent VC schedules of positive versus positive reinforcement (P:P) and negative versus positive reinforcement (N:P). For all subjects, matching slopes were steeper, and therefore responding was more sensitive to relative reinforcement frequency differences, under N:P reinforcement (i.e., N:P a-values > P:P a-values). The slope differences observed by Magoon and Critchfield (2005) appeared to support the prediction of CDT that, when responses are simultaneously and concurrently available, sensitivity to reinforcement should be increased when their respective consequences are qualitatively dissimilar relative to when they are qualitatively identical. The purpose of the present experiment was to replicate the Magoon and Critchfield (2005) study, correcting for the two procedural ambiguities described previously. Specifically, to keep the stimulus context of the concurrent responses as similar as possible, the moving target boxes were always a single color (red), and to keep the economies of the P:P and N:P phases as similar as possible, the post-session money supplements used by Magoon and Critchfield (2005) were omitted.

Method

Subjects were three college students. In conditions of one phase, VC schedules of "standard" positive reinforcement were programmed for both work areas (P:P). In conditions of the other phase, VC schedules of "standard" negative reinforcement were programmed for one work area and VC schedules of "standard" positive reinforcement were programmed for the other (N:P). Consequence magnitudes were 4¢. Table 1 shows experimental arrangements and the sequence of conditions completed by each subject for Experiment 1.

In VC schedules of "standard" positive reinforcement, the first response in an interval immediately caused a money gain to be signaled by the disappearance of the target box associated with that schedule and flashing black text stating "+4" (5 text flashes at 0.25 s per flash with 0.25 s between flashes; consequence magnitude = 4¢). During this display, the cursor disappeared, responses were ineffective, and all timers were suspended. Following the conclusion of the money-gain message, further target responses were counted but did not influence the schedule for the remainder of the interval. If no response was made during an interval, the money gain programmed for that interval was cancelled at the end of the interval without stimulus change.

In VC schedules of "standard" negative reinforcement, the first response in an interval immediately cancelled the programmed money deduction for that interval without stimulus change. Further target responses were counted but did not influence the schedule for the remainder of the interval. If no response was made during an interval, a money deduction was signaled at the end of the interval by a flashing alternation of the target box associated with that schedule and black text stating "-4" (5 text flashes and 5 red target flashes at 0.25 s per flash each; consequence magnitude = 4¢). During this display, the cursor disappeared, responses were ineffective, and all timers were suspended.
Results

Figure 1 shows for each subject (a) plots of the logarithms of the relative response ratios (left panels) and relative time-allocation ratios (right panels) as functions of the logarithms of the relative obtained reinforcement ratios, using averaged data from the last four (stable) sessions of each condition (see Appendix B for raw data values), and (b) lines of best fit, as determined by Equation 2, to the data of each phase. Closed circles and heavy lines represent the data for the P:P conditions, and closed triangles and light lines represent the data for the N:P conditions.

In Figure 1, the data in the P:P phases plot the logarithms of the ratios of left responses (or time) to right responses (or time) (L/R) against the logarithms of the ratios of obtained left reinforcers to obtained right reinforcers (L/R). However, plotting left/right ratios for the data in the N:P phases, in principle, could distort any differential effects of "standard" negative reinforcement, because the location of "standard" negative reinforcement (L vs. R) varied unsystematically with relative reinforcement ratios (rich vs. lean) across conditions within phase. Specifically, plotting the data in the N:P phases by left/right ratios might alter the shape (i.e., slope) or, more likely, the location (i.e., intercept) of the matching functions to reflect more heavily the effects of location than reinforcement type (see Magoon and Critchfield, 2005, for elaboration). Consequently, the N:P functions in Figure 1 show the logarithms of the ratios of "standard" negative reinforcement responses (or time) to "standard" positive reinforcement responses (or time) (N/P) against the corresponding logarithms of the ratios of obtained "standard" negative reinforcement reinforcers to obtained "standard" positive reinforcement reinforcers (N/P).

Table 5 lists the regression equations (from Equation 2) and r² values for individual data, separately for response allocation and time allocation, for both P:P and N:P phases. The proportions of variance accounted for ranged from acceptable (e.g., SO104 P:P time r² = .865) to excellent (e.g., TM102 P:P responses r² = .998), with generally better fits for N:P phases (.950 ≤ r² ≤ .987, M = .971) than for P:P phases (.865 ≤ r² ≤ .998, M = .919). Parameter estimates are addressed below.

Slopes (Sensitivity)

Visual analysis. Visual inspection of both response-based and time-based graphs in Figure 1 shows that, in all cases, the N:P slopes are steeper than the P:P slopes, although the differences for TB101 for responses and time are small.

Descriptive analysis. The fifth column of Table 5 lists the difference between the N:P phase a (slope) coefficient and the P:P phase a coefficient for each subject (a(N:P) − a(P:P)). A positive a-difference indicates a steeper a for the N:P phase than for the P:P phase, and a negative a-difference indicates a shallower a for the N:P phase than for the P:P phase. As with visual inspection, a-difference outcomes were consistent across subjects. They were all positive (.264 ≤ a(N:P) − a(P:P) ≤ .722, M = .490), indicating that, for each subject, the N:P slope was greater than its P:P counterpart. This effect generally was more pronounced for response-based analyses (.296 ≤ a(N:P) − a(P:P) ≤ .722, M = .550) than for time-based analyses (.264 ≤ a(N:P) − a(P:P) ≤ .596, M = .430).
All but one of the N:P slopes (SO104's slope for time) and none of the P:P slopes were greater than 1.

Normative analysis. The goal of the present investigation was to determine whether manipulation of one or more elements of different reinforcer types would create a change in the slope of the function derived from fitting Equation 2 to the data from two experimental phases completed by the same individual. Making this determination would be facilitated by knowing the test-retest reliability of the slope parameter of Equation 2. In descriptive terms, a slope effect could be defined as a change in the slope estimate that exceeds the typical range of within-subject variance. Unappreciated at the time this investigation was initiated, however, was the fact that, despite the publication of several reviews of the many dozens of experiments employing Equation 2 (e.g., Baum, 1979; Kollins et al., 1997; Robinson, 1992; Wearden & Burgess, 1982), the test-retest reliability of this model's fitted parameters apparently has not been assessed systematically. Only a few published studies have assessed the matching relation twice for the same subjects under identical conditions (e.g., McLean & Blampied, 2001; Takahashi & Iwamoto, 1986; Todorov, Oliveira Castro, Hanna, Bittencourt de Sa, & Barreto, 1983), and too few of these cases exist to support any actuarial conclusions. This surprising weakness in the literature creates problems for the present investigation.

In lieu of authoritative information on the stability of matching functions over time for individual subjects, what can be determined from the published literature is the amount of between-subject variance that occurs in matching slopes. Robinson (1992) reviewed all nonhuman studies in which Equation 2 was employed and found that these estimates were approximately normally distributed. Robinson's (1992) data set is large enough (N = 125 individual response slopes) that it is possible to derive an estimate of the typical range of intersubject slope differences. Robinson's (1992) Figure 1 (p. 439, top left panel) serves as the basis for such an estimate. By assuming a normal distribution, using the median as the measure of central tendency, and counting, to either side of the median, the number of cases that together constitute 95% of the total number of cases (i.e., 2 standard deviations to each side), the "normal" range of slope values can be estimated at .427 to 1.202. Robinson cited the mean of the distribution as .81; therefore, the distance two standard deviations below the mean is estimated at .383 (.81 − .427 = .383) and the distance two standard deviations above the mean is estimated at .392 (1.202 − .81 = .392). The average of these two estimates of two standard deviations is .388 ((.392 + .383)/2 = .3875). Thus, for the "normative analysis" in the present investigation, a difference between slopes of ≥ .388 for response-based analyses was considered outside the "normal" range of variation and used as a makeshift significance criterion. No evaluations of slope differences for time-based analyses will be made using criteria from Robinson (1992) because, for most concurrent-schedules procedures, response allocation and time allocation are strongly correlated (for a review, see Robinson, 1992; for an example, see Critchfield, Paletz, MacAleese, & Newland, 2003). Two subjects had slope differences larger than .388 (TM102 difference = .722, SO104 difference = .632) and one did not (TB101 difference = .296).
Inferential statistical analysis. In addition to the three previous analyses, slope effects were evaluated for each individual using multiple regression/correlation analysis (MRC) (Cohen, Cohen, West, & Aiken, 2003). This type of analysis regresses the dependent variable (for matching analyses, response allocation and/or time allocation) on two or more independent variables (in the present investigation, reinforcement frequency and reinforcer type) simultaneously, according to two different models, and then evaluates the individual coefficients via t-tests and the changes in the proportion of variance accounted for between the two models via F-tests. One model employs one parameter for each independent variable (in the present investigation, B1 for relative reinforcement frequency and B2 for reinforcement type) to evaluate the main effect of each, and a constant (B0) to evaluate an overall mean difference between conditions apart from the main effects (i.e., partialed bias). The second model is the same as the first model except that it incorporates a fourth parameter (B12) as a measure of any possible interaction effects between independent variables (in the present investigation, reinforcement frequency x reinforcement type). The interaction parameter is the measure of the difference between the slopes of the individual regression lines, but considered separately (i.e., partialed) from the main effects of each independent variable and the overall mean difference. The measure of fit for the regression of each model is R², the proportion of the dependent variable's variance shared with the optimally weighted independent variables. For present purposes, the crucial issue is whether the interaction parameter of the second model (a) is statistically significantly different from zero, and (b) accounts for a statistically significant increase in R² over and above that found by the first model. Both tests yield identical p values, so only the results of the latter test will be reported.

Table 6 summarizes the results of the MRC analysis. All R² values were acceptable (e.g., TM102 M.E. responses R² = .872) to excellent (e.g., TM102 Full time R² = .989). In Table 6, the "R² Change" value is the change in variance accounted for by the addition of the B12 interaction parameter, and the p value is the probability of getting a difference of that size if there is no actual difference in variance. By this test, two of the three subjects showed a statistically significant slope increase for the response-based analysis (TM102 and SO104) and one showed a statistically significant slope increase for the time-based analysis (TM102) (although it should be noted that SO104's time-based test approached statistical significance, p = .063). This test indicated that the perceived slope differences between N:P and P:P functions in Figure 1 were "real" (in the sense conveyed by inferential statistics) for only two of the subjects for responses and for one subject for time. However, if SO104's p value for the time-based analysis is considered in the context of the conclusions of the visual and descriptive analyses, it would not be unreasonable to conclude that the perceived slope difference for SO104's time-based analysis is functionally important.

Intercepts (Bias)

Visual analysis. Visual inspection of both response-based and time-based graphs in Figure 1 shows no systematic bias for either type of reinforcement.
The graphs show no pronounced intercept difference between functions for TB101 and TM102, and a higher intercept for N:P than for P:P for SO104.

Descriptive analysis. The last column of Table 5 lists the difference between the N:P phase log c (intercept) coefficient and the P:P phase log c coefficient for each subject (c(N:P) − c(P:P)). A positive log c-difference indicates a higher log c for the N:P phase than for the P:P phase, and a negative log c-difference indicates a lower log c for the N:P phase than for the P:P phase. As with visual inspection, log c-difference outcomes were inconsistent across subjects. Two subjects had positive log c-differences for both response-based and time-based analyses (TB101 responses = .158, time = .136; SO104 responses = .588, time = .461), while the third had negative log c-differences (TM102 responses = -.140, time = -.171). Two subjects had larger log c-differences for response-based analyses than for time-based analyses (TB101 and SO104), while the third had a larger log c-difference for the time-based analysis than for the response-based analysis (TM102). Two subjects had log c-differences close to zero for both response-based and time-based analyses (TB101 and TM102), while the third had much larger log c-differences (SO104).

Normative analysis. The normative approach employed to evaluate slope effects for response-based analyses was not applied to bias estimates.

Inferential statistical analysis. The intercept of any MRC regression model represents the estimated measure of the dependent variable for the reference group apart from the effects of each independent variable and their interactions (i.e., reference-group bias). In the context of the present investigation, the intercept represents the logarithm of the relative ratio of responding when there is equal reinforcement obtained from each of the two measured concurrent response options (because X/X = 1 and log 1 = 0). MRC analysis provides two relevant statistical tests of intercepts.

The first test of intercepts evaluates whether the intercept of the regression function from each phase (e.g., N:P and P:P) is significantly different from zero (two-tailed test). In the present case, this is done by building two full models: one using the data from the P:P phase (L/R ratios as described in the Experiment 1 Results section) as the reference group and one using the data from the N:P phase (N/P ratios) as the reference group. The intercept parameter (B0) of each model represents the response bias resulting from reinforcement type (N:P data) or side (P:P data), and the t-test of that parameter evaluates whether that intercept is significantly different from zero. A significant positive t-value indicates that the intercept is statistically significantly larger than zero (i.e., there is a bias toward whatever is represented in the numerator for that phase). A significant negative t-value indicates that the intercept is statistically significantly smaller than zero (i.e., there is a bias away from whatever is represented in the numerator for that phase).

For the t-tests of the intercepts against the nil hypothesis, none of the P:P regressions for either response-based analyses or time-based analyses was significantly different from zero (i.e., there was no side bias), one of the N:P regressions for response-based analyses was significantly different from zero (SO104, t = 5.112, bias toward "standard"
negative reinforcement, p = .007), and two of the N:P regressions for time-based analyses were significantly different from zero (TM102, t = -3.423, bias away from "standard" negative reinforcement, p = .027; SO104, t = 5.197, bias toward "standard" negative reinforcement, p = .007).

The second test of intercepts evaluates whether there is a significant difference between the intercepts of the two phases (two-tailed test). A full model with either phase as the reference group provides the relevant t-test. The data from either phase are acceptable as the reference group because the effect is judged by the coefficient for the main effect of reinforcement type (B2). The absolute value of that parameter, and thus the p value, is the same regardless of which phase is used as the reference group; only the sign differs. For the t-tests of the intercepts against each other, the data from the conditions of the P:P phase were used as the reference group. This means that a significant positive t-value indicates that the intercept of the N:P phase is statistically significantly larger than the intercept of the P:P phase (i.e., there is a bias toward "standard" negative over "standard" positive reinforcement), and a significant negative t-value indicates that the intercept of the N:P phase is statistically significantly smaller than the intercept of the P:P phase (i.e., there is a bias toward "standard" positive over "standard" negative reinforcement). These t-tests of the intercepts resulted in one subject showing a significant difference between the two phases for the response-based analysis (SO104, t = 4.109, bias toward "standard" negative over "standard" positive reinforcement, p = .015), and two subjects showing a significant difference between the two phases for time-based analyses (TM102, t = -2.985, bias toward "standard" positive over "standard" negative reinforcement, p = .041; SO104, t = 4.194, bias toward "standard" negative over "standard" positive reinforcement, p = .014).

Experiment 1 Results Summary

Visual and descriptive analyses showed a difference in slopes for all three subjects for both response-based and time-based analyses. The normative and MRC analyses showed a difference in slopes for two of three subjects for the response-based analysis. The MRC analysis showed a difference in slopes for only one subject for the time-based analysis (though it would not be unreasonable, in the context of all analyses, to conclude that SO104's time-based slopes were different, meaning that two subjects rather than one showed the difference). No systematic bias effects were observed under any analysis.

Discussion

Visual and descriptive analyses of slopes showed increased behavioral sensitivity to concurrently available "standard" negative versus "standard" positive reinforcement over that of concurrently available "standard" positive versus "standard" positive reinforcement for all subjects for both response-based and time-based analyses. This finding replicates the results of Magoon and Critchfield (2005) and Critchfield and Lane (2005). According to the normative and MRC analyses, a slope effect was observed for two of three subjects for the response-based analysis (TM102 and SO104) and for only one subject for the time-based analysis (TM102). The failure of the other subject's slope effect to meet the criterion of the normative approach or statistical significance might be attributed to measurement error, individual differences, or some other factor. (A sketch of the MRC interaction test described above appears below.)
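For concreteness, the MRC interaction test can be sketched as follows. The data frame is hypothetical, and the statsmodels formula interface is an assumption of the sketch, not the software used for the reported analyses; the x:phase interaction term carries the slope difference (B12).

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical within-subject data: log obtained-reinforcer ratios (x),
# log response ratios (y), and phase dummy coded 0 (P:P) or 1 (N:P).
df = pd.DataFrame({
    "x":     [-0.70, -0.30, 0.30, 0.70, -0.70, -0.30, 0.30, 0.70],
    "phase": [0, 0, 0, 0, 1, 1, 1, 1],
    "y":     [-0.55, -0.25, 0.20, 0.52, -0.95, -0.40, 0.45, 0.98],
})

# Main-effects model: one common slope, with phase shifting only the
# intercept (B0, B1, B2 in the text's notation).
main = smf.ols("y ~ x + phase", data=df).fit()

# Full model: adds the x-by-phase interaction, i.e., a phase-specific
# slope (the B12 parameter).
full = smf.ols("y ~ x * phase", data=df).fit()

# F-test of the R-squared change contributed by the interaction term;
# it yields the same p value as the t-test of the interaction coefficient.
f_value, p_value, df_diff = full.compare_f_test(main)
print(f"R2 change = {full.rsquared - main.rsquared:.3f}, p = {p_value:.3f}")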
It would have been desirable to add more subjects to the experiment to clarify such possibilities, but resource constraints precluded this. Ultimately, the normative and MRC analyses of the present experiment highlight two things. First, not all modes of analysis will always agree. Second, in comparing empirical functions within subject in free-operant experiments, the question of how different is "really" different poses some challenges. This matter will be taken up in the General Discussion.

Visual and descriptive analyses supported two conclusions regarding slope. First, in three experiments using nearly identical procedures (the present one; Critchfield & Lane, 2005; Magoon & Critchfield, 2005), N:P slopes have been steeper than P:P slopes for 9 of 9 subjects. Assuming that no two matching slopes will be precisely identical, the probability of this aggregate outcome occurring by chance is 0.5⁹ = .00195, or about 1 chance in 500. Second, the present results are similar to those of Magoon and Critchfield (2005) despite changes in stimulus features designed to control for possible differential conditional stimulus control and despite no use of monetary supplements for "standard" negative reinforcement conditions. This outcome suggests that the suspected procedural ambiguities in Magoon and Critchfield (2005) did not play an integral part in their results, bolstering the conclusion that the slope effects reflect response-consequence discriminability only.

Experiment 2

The most important finding from Experiment 1 was that the procedural changes from Magoon and Critchfield (2005) did not preclude the slope differences between phases. The slope differences were found despite the use of identical target box colors and no use of monetary supplements. This outcome suggests that the between-phase differences are not likely the result of systematic differences in stimulus-response contingency discriminability, but are likely due to differences in response-consequence contingency discriminability. The question now becomes what particular feature(s) of "standard" negative reinforcement make(s) it qualitatively different from "standard" positive reinforcement as expressed by the slope-increasing effect observed in Experiment 1.

The two most salient differences between "standard" positive and "standard" negative reinforcement, as programmed in Experiment 1, are differences of gains versus losses and feedback differences. For differences of gains versus losses, in the context of the present investigation, "standard" positive reinforcement involves gaining money for response patterns that meet schedule requirements (e.g., responding at a high rate) and not gaining money for response patterns that fail to meet schedule requirements (e.g., no responding). Money can never be lost. "Standard" negative reinforcement involves losing money for response patterns that fail to meet schedule requirements and avoiding losing money for response patterns that meet schedule requirements. Money can never be gained. For feedback differences, "standard" positive reinforcement results in a stimulus change only when a reinforcer is "earned" (i.e., a gain is signaled), but "standard" negative reinforcement results in a stimulus change only when a reinforcer is "missed" (i.e., a loss is signaled). Consequently, more feedback occurs under "standard" positive reinforcement than under "standard"
negative reinforcement when more than 50% of the programmed number of reinforcers are earned from each type of reinforcement, but more feedback occurs under "standard" negative reinforcement than under "standard" positive reinforcement when less than 50% of the programmed number of reinforcers are earned from each type of reinforcement. Experiments 2 and 3 were designed to test the necessity and sufficiency of these two variables in producing the slope differences found in Experiment 1, Magoon and Critchfield (2005), and Critchfield and Lane (2005).

Experiment 2 was designed to test the sufficiency of feedback differences and the necessity of gains versus losses to produce the slope differences found between the N:P and P:P phases of Experiment 1, Magoon and Critchfield (2005), and Critchfield and Lane (2005). This was done by restructuring the consequences to keep the feedback distinction but to eliminate the gain/loss distinction. Gains were programmed for all contingencies of all conditions in both phases. However, the feedback structure in the conditions of the "experimental" phase was designed to mimic the feedback structure of "standard" negative reinforcement. This type of contingency is referred to here as "inverse" positive (IP) reinforcement.

Method

Subjects were three college students. In conditions of one phase, VC schedules of "standard" positive reinforcement were programmed for both work areas (P:P). In conditions of the other phase, VC schedules of "inverse" positive reinforcement were programmed for one work area and VC schedules of "standard" positive reinforcement were programmed for the other (IP:P). Consequence magnitudes were 2¢. Table 2 shows experimental arrangements and the sequence of conditions completed by each subject for Experiment 2.

VC schedules of "standard" positive reinforcement were programmed as described in Experiment 1. In VC schedules of "inverse" positive reinforcement, the first response in an interval immediately caused an unsignaled money gain. Further target responses were counted but did not influence the schedule for the remainder of the interval. If no response was made during an interval, the loss of an opportunity to earn money was signaled at the end of the interval by a flashing alternation of the target box associated with that schedule and black text stating "No +2" (5 text flashes and 5 red target flashes at 0.25 s per flash each; consequence magnitude = 2¢). During this display, the cursor disappeared, responses were ineffective, and all timers were suspended.

Results

Figure 2 plots the data for Experiment 2 (see Appendix B for raw data values) according to the conventions of Experiment 1, but with the response and reinforcement rates from the "inverse" positive reinforcement contingencies calculated in the numerators of the response (and time-allocation) and reinforcement ratios for the "experimental" phase data. Also as in Experiment 1, least-squares regression lines, as determined by Equation 2, were fit to the data of each phase. Closed circles and heavy lines represent the data for the P:P conditions, and closed diamonds and light lines represent the data for the IP:P conditions.

Table 7 lists the regression equations (from Equation 2) and r² values for individual data, separately for response allocation and time allocation, for both P:P and IP:P phases.
The proportions of variance accounted for ranged from acceptable (e.g., GE105 P:P time r² = .864) to excellent (e.g., GE105 IP:P time r² = .996), with similar fits for both phases (P:P: .864 ≤ r² ≤ .987, M = .956; IP:P: .915 ≤ r² ≤ .996, M = .959). Parameter estimates are addressed below.

Slopes (Sensitivity)

Visual analysis. Visual inspection of both response-based and time-based graphs in Figure 2 shows that, in all cases, the IP:P slopes are steeper than the P:P slopes.

Descriptive analysis. The fifth column of Table 7 lists the difference between the IP:P phase a (slope) coefficient and the P:P phase a coefficient for each subject (a(IP:P) − a(P:P)). A positive a-difference indicates a steeper a for the IP:P phase than for the P:P phase, and a negative a-difference indicates a shallower a for the IP:P phase than for the P:P phase. As with visual inspection, a-difference outcomes were consistent across subjects. They were all positive (.572 ≤ a(IP:P) − a(P:P) ≤ .931, M = .749), indicating that, for each subject, the IP:P slope was greater than its P:P counterpart. This effect generally was more pronounced for response-based analyses (.761 ≤ a(IP:P) − a(P:P) ≤ .931, M = .871) than for time-based analyses (.572 ≤ a(IP:P) − a(P:P) ≤ .729, M = .628). All but two of the IP:P slopes (GE105's and GR108's slopes for time) and none of the P:P slopes were greater than 1.

Normative analysis. According to the normative criterion derived from Robinson's (1992) review of the matching literature, a difference of slopes ≥ .388 is a meaningful difference. All slope differences in Experiment 2 were larger than .388.

Inferential statistical analysis. Table 8 summarizes the results of the MRC analysis. Not all R² values met the tacit standard of acceptability found in the literature (i.e., R² ≥ .80). GR108's main-effects-only regression models for the response-based and time-based analyses accounted for just 76.2% and 79.3% of the variance, respectively. However, GR108's full regression models, including the interaction term, accounted for 92.5% of the variance for the response-based analysis and for 94.3% of the variance for the time-based analysis. This large increase in variance accounted for between models under both analyses indicates a strong interaction effect. The remaining regression fits were acceptable (e.g., MJ106 M.E. time R² = .828) to excellent (e.g., GE105 Full time R² = .995). In Table 8, the "R² Change" value is the change in variance accounted for by the addition of the B_12 interaction parameter, and the p value is the probability of obtaining a difference of that size if there is no actual difference in variance. By this test, all three subjects showed a statistically significant slope increase for both the response-based and time-based analyses.

Intercepts (Bias)

Visual analysis. Visual inspection of both response-based and time-based graphs in Figure 2 shows a systematic bias for "inverse" positive reinforcement. Since the IP:P phase data are plotted with "inverse" positive reinforcement data in the numerators of both the response ratios and the reinforcement ratios, the higher point at which the regression lines for those phases cross the ordinates indicates higher rates of responding to contingencies of "inverse" positive reinforcement than to "standard" positive reinforcement regardless of the relative reinforcement rate between concurrent schedules.

Descriptive analysis.
The last column of Table 7 lists the difference between the IP:P phase log c (intercept) coefficient and the P:P phase log c coefficient for each subject (c(IP:P) − c(P:P)). A positive log c-difference indicates a higher log c for the IP:P phase than for the P:P phase, and a negative log c-difference indicates a lower log c for the IP:P phase than for the P:P phase. As with visual inspection, log c-difference outcomes were consistent across subjects. All three subjects had positive log c-differences for both response-based analyses and time-based analyses (.181 ≤ c(IP:P) − c(P:P) ≤ .614), with generally larger log c-differences for response-based analyses (.255 ≤ c(IP:P) − c(P:P) ≤ .614, M = .409) than for time-based analyses (.181 ≤ c(IP:P) − c(P:P) ≤ .425, M = .277).

Inferential statistical analysis. For the t-tests of the intercepts against the nil hypothesis, none of the P:P regressions for either response-based analyses or time-based analyses was significantly different from zero (i.e., there was no side bias), two of the IP:P regressions for response-based analyses were significantly different from zero (GE105, t = 19.065, bias toward "inverse" positive reinforcement, p = .000; MJ106, t = 3.211, bias toward "inverse" positive reinforcement, p = .033), and one of the IP:P regressions for time-based analyses was significantly different from zero (GE105, t = 23.087, bias toward "inverse" positive reinforcement, p = .000) (although it should be noted that GR108's time-based test approached statistical significance in the direction indicating a bias toward "inverse" positive reinforcement, p = .052).

For the t-tests of the intercepts against each other, the data from the conditions of the P:P phase were used as the reference group. This means that a significant positive t-value indicates that the intercept of the IP:P phase is statistically significantly larger than the intercept of the P:P phase (i.e., there is a bias toward "inverse" positive over "standard" positive reinforcement) and a significant negative t-value indicates that the intercept of the IP:P phase is statistically significantly smaller than the intercept of the P:P phase (i.e., there is a bias toward "standard" positive over "inverse" positive reinforcement). These t-tests of the intercepts resulted in one subject showing a significant difference between the two phases for both the response-based and time-based analyses (GE105: response-based t = 13.287, bias toward "inverse" positive over "standard" positive reinforcement, p = .000; time-based t = 16.071, bias toward "inverse" positive over "standard" positive reinforcement, p = .000) (although it should be noted that MJ106's response-based test approached statistical significance, p = .061). The results of these tests indicate that the perceived intercept differences between the IP:P and P:P functions in Figure 2 are "real" (in the sense conveyed by inferential statistics) for only one subject. However, if MJ106's p value for the response-based analysis is considered in the context of the conclusions of the visual and descriptive analyses, it would not be unreasonable to conclude that the perceived intercept difference for MJ106's response-based analysis is functionally important.

Experiment 2 Results Summary

Visual, descriptive, normative, and MRC analyses all showed a difference in slopes for all three subjects for both response-based and time-based analyses. Although not every analysis verified the perceived bias toward "inverse"
positive reinforcement over "standard" positive reinforcement in every case, the evidence converged on the conclusion that there was a systematic bias for the "inverse" positive reinforcement contingency.

Discussion

Visual and descriptive analyses of slopes showed increased behavioral sensitivity to concurrently available "inverse" positive versus "standard" positive reinforcement over that of concurrently available "standard" positive versus "standard" positive reinforcement for all subjects for both response-based and time-based analyses. All differences between phase slopes exceeded the normative criterion of .388 and all within-subject slope differences were statistically significant. Overall, slope effects were qualitatively similar to those found in Experiment 1, by Magoon and Critchfield (2005), and by Critchfield and Lane (2005), although the magnitudes of the shifts may be larger than seen previously. These findings suggest that a gain-loss difference is not necessary, but feedback asymmetry is sufficient, to heighten response-consequence contingency discriminability. The procedures of this experiment do not address whether feedback asymmetry is necessary and gains versus losses are sufficient to produce the increased sensitivity to reinforcement frequency. Experiment 3 will address these issues.

Visual and descriptive analyses of bias showed all "inverse" positive reinforcement intercepts to be higher than their comparable "standard" positive reinforcement intercepts, indicating a systematic bias toward "inverse" positive reinforcement. This effect appears comparable to that found when concurrent schedules with different magnitude reinforcers are used (e.g., Critchfield & Merrill, 2005; McLean & Blampied, 2001), but has not been reported before in procedures using equal magnitude reinforcers. As the primary focus of the present investigation is on the variables that may influence sensitivity to reinforcement, not bias, no speculation regarding the variables responsible for the intercept shifts in Experiment 2 will be offered here. However, the phenomenon seems robust enough to warrant further study.

The failure of two subjects' differences between intercepts to meet statistical significance, and thus to corroborate entirely the conclusions from visual and descriptive analyses, once again highlights the fact that not all modes of analysis will always agree with each other and that comparing empirical functions within subject in free-operant experiments poses some difficult challenges.

Experiment 3

The results of Experiment 2 illustrated the sufficiency of feedback asymmetry to produce an increased behavioral sensitivity to relative reinforcement. To show that a variable is sufficient to produce an effect, however, does not make it necessary to produce the effect. Furthermore, just because gains versus losses were shown not to be necessary to produce the effect in Experiment 2 does not rule out their possible sufficiency (as predicted by CDT). Experiment 3 was designed to test the necessity of feedback asymmetry and the sufficiency of gains versus losses to produce increased behavioral sensitivity to relative reinforcement. This was accomplished by holding feedback asymmetry constant across phases while manipulating gains versus losses in one of the phases. The only way to hold feedback constant in a manner that was not more akin to "standard" negative or "standard"
positive reinforcement was to provide feedback (i.e., present a stimulus) for all possible outcomes of each schedule cycle, regardless of phase. This type of contingency is referred to here as "total feedback" reinforcement.

Method

Subjects were three college students. In conditions of one phase, VC schedules of "total feedback" positive reinforcement were programmed for both work areas (TP:TP). In conditions of the other phase, VC schedules of "total feedback" negative reinforcement were programmed for one work area, and VC schedules of "total feedback" positive reinforcement were programmed for the other (TN:TP). Consequence magnitudes were 4¢. Table 3 shows experimental arrangements and the sequence of conditions completed by each subject for Experiment 3.

In VC schedules of "total feedback" positive reinforcement, the first response in an interval immediately caused a money gain to be signaled by the disappearance of the target box associated with that schedule and flashing black text stating, "+4" (5 text flashes at 0.25 s per flash with 0.25 s between flashes, consequence magnitude = 4¢). Following the conclusion of the money-gain message, further target responses were counted but did not influence the schedule for the remainder of the interval. If no response was made during an interval, the loss of an opportunity to earn money was signaled at the end of the interval by a flashing alternation of the target box associated with that schedule and black text stating, "No +4" (5 text flashes and 5 red target flashes at 0.25 s per flash each, consequence magnitude = 4¢). During both types of feedback messages, the cursor disappeared, responses were ineffective, and all timers were suspended.

In VC schedules of "total feedback" negative reinforcement, the first response in an interval immediately caused the cancellation of a programmed money deduction to be signaled by the disappearance of the target box associated with that schedule and flashing black text stating, "No -4" (5 text flashes and 5 red target flashes at 0.25 s per flash each, consequence magnitude = 4¢). Following the conclusion of the cancellation message, further target responses were counted but did not influence the schedule for the remainder of the interval. If no response was made during an interval, a money deduction was signaled at the end of the interval by a flashing alternation of the target box associated with that schedule and black text stating, "-4" (5 text flashes and 5 red target flashes at 0.25 s per flash each, consequence magnitude = 4¢). During both types of feedback messages, the cursor disappeared, responses were ineffective, and all timers were suspended.

Results

Figure 3 plots the data for Experiment 3 (see Appendix B for raw data values) according to the conventions of Experiments 1 and 2, but with the response and reinforcement rates from the "total feedback" negative reinforcement contingencies calculated in the numerators of the response (and time allocation) and reinforcement ratios for the "experimental" phase data. Also as in Experiments 1 and 2, least-squares regression lines, as determined by Equation 2, were fit to the data of each phase. Open circles and heavy lines represent the data for the TP:TP conditions, and closed squares and light lines represent the data for the TN:TP conditions. Table 9 lists the regression equations (from Equation 2) and r² values for individual data, separately for response allocation and time allocation, for both TP:TP and TN:TP phases.
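As context for the MRC comparisons reported in this and the other experiments, the between-phase slope and intercept tests can be framed as a single dummy-coded regression. The sketch below is a plausible reconstruction rather than the original SPSS analysis, and all variable names and data values are hypothetical (Python with pandas and statsmodels assumed):

    import pandas as pd
    import statsmodels.formula.api as smf

    # One row per condition: log reinforcement ratio (x), log behavior ratio (y),
    # and phase dummy-coded 0 = baseline (TP:TP), 1 = experimental (TN:TP).
    df = pd.DataFrame({
        "x":     [-0.6, -0.3, 0.3, 0.6, -0.6, -0.3, 0.3, 0.6],
        "y":     [-0.5, -0.2, 0.2, 0.5, -0.9, -0.4, 0.5, 1.0],
        "phase": [0, 0, 0, 0, 1, 1, 1, 1],
    })

    main = smf.ols("y ~ x + phase", data=df).fit()             # Main Effects model
    full = smf.ols("y ~ x + phase + x:phase", data=df).fit()   # adds the interaction

    # "R-squared Change" test: does the interaction (slope difference) add variance?
    f_val, p_val, _ = full.compare_f_test(main)

    # Intercept tests: the t-test on full.params["Intercept"] is the nil-hypothesis
    # (side-bias) test for the reference phase, and the t-test on
    # full.params["phase"] compares the two phases' intercepts.

With eight conditions per subject, this parameterization also reproduces the degrees of freedom discussed later (5 for Main Effects models, 4 for Full models).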
The proportions of variance accounted for ranged from good (e.g., JE109 TP:TP time r² = .914) to excellent (e.g., SL110 TP:TP time r² = .998), with similar fits for both phases (TP:TP: .914 ≤ r² ≤ .998, M = .953; TN:TP: .916 ≤ r² ≤ .992, M = .959). Parameter estimates are addressed below.

Slopes (Sensitivity)

Visual analysis. Visual inspection of both response-based and time-based graphs in Figure 3 shows that, in all cases, the TN:TP slopes are steeper than the TP:TP slopes, although the differences for SL110 for responses and time are small.

Descriptive analysis. The fifth column of Table 9 lists the difference between the TN:TP phase a (slope) coefficient and the TP:TP phase a coefficient for each subject (a(TN:TP) − a(TP:TP)). A positive a-difference indicates a steeper a for the TN:TP phase than for the TP:TP phase, and a negative a-difference indicates a shallower a for the TN:TP phase than for the TP:TP phase. As with visual inspection, a-difference outcomes were consistent across subjects. They were all positive (.263 ≤ a(TN:TP) − a(TP:TP) ≤ .734, M = .545), indicating that, for each subject, the TN:TP slope was greater than its TP:TP counterpart. This effect generally was more pronounced for response-based analyses (.433 ≤ a(TN:TP) − a(TP:TP) ≤ .734, M = .607) than for time-based analyses (.263 ≤ a(TN:TP) − a(TP:TP) ≤ .631, M = .482). All but one of the TN:TP slopes (JE109's slope for time) and only one of the TP:TP slopes (SL110's slope for time) were greater than 1.

Normative analysis. According to the normative criterion derived from Robinson's (1992) review of the matching literature, a difference of slopes ≥ .388 is a meaningful difference. All response-based slope differences in Experiment 3 were larger than .388; two of the three time-based differences also exceeded the criterion.

Inferential statistical analysis. Table 10 summarizes the results of the MRC analysis. All R² values were acceptable (e.g., JE109 M.E. responses R² = .859) to excellent (e.g., JE109 Full responses R² = .982). In Table 10, the "R² Change" value is the change in variance accounted for by the addition of the B_12 interaction parameter, and the p value is the probability of obtaining a difference of that size if there is no actual difference in variance. By this test, only one of the three subjects showed a statistically significant slope increase for the response-based analysis (JE109) and two showed a statistically significant slope increase for the time-based analysis (JE109 and NN111) (although it should be noted that NN111's response-based test approached statistical significance, p = .061). This test indicated that the perceived slope differences between TN:TP and TP:TP functions in Figure 3 were "real" (in the sense conveyed by inferential statistics) for only one subject for responses and for two subjects for time. However, if NN111's p value for the response-based analysis is considered in the context of the conclusions of the visual and descriptive analyses, it would not be unreasonable to conclude that the perceived slope difference for NN111's response-based analysis is functionally important.

Intercepts (Bias)

Visual analysis. Visual inspection of both response-based and time-based graphs in Figure 3 shows no systematic bias for either type of reinforcement. The graphs show no pronounced intercept difference between functions for any subject and all intercepts are close to zero.

Descriptive analysis.
The last column of Table 9 lists the difference between the TN:TP phase log c (intercept) coefficient and the TP:TP phase log c coefficient for each subject (c(TN:TP) − c(TP:TP)). A positive log c-difference indicates a higher log c for the TN:TP phase than for the TP:TP phase, and a negative log c-difference indicates a lower log c for the TN:TP phase than for the TP:TP phase. As with visual inspection, log c-difference outcomes were inconsistent across subjects. Two subjects had positive log c-differences for response-based analyses (JE109 = .151; NN111 = .005) while the third had a negative log c-difference (SL110 = -.136). Two subjects had negative log c-differences for time-based analyses (SL110 = -.098; NN111 = -.037) while the third had a positive log c-difference (JE109 = .151). Two subjects had larger log c-differences for response-based analyses than for time-based analyses (JE109 and SL110) while the third had a larger log c-difference for time-based than for response-based analyses (NN111). All subjects had log c-differences close to zero (-.136 ≤ c(TN:TP) − c(TP:TP) ≤ .151).

Inferential statistical analysis. For the t-tests of the intercepts against the nil hypothesis, none of the TP:TP regressions for either response-based analyses or time-based analyses was significantly different from zero (i.e., there was no side bias) and none of the TN:TP regressions for either response-based analyses or time-based analyses was significantly different from zero (i.e., there was no bias toward either type of reinforcement).

For the tests of the intercepts against each other, the data from the conditions of the TP:TP phase were used as the reference group. This means that a significant positive t-value indicates that the intercept of the TN:TP phase is statistically significantly larger than the intercept of the TP:TP phase (i.e., there is a bias toward "total feedback" negative over "total feedback" positive reinforcement) and a significant negative t-value indicates that the intercept of the TN:TP phase is statistically significantly smaller than the intercept of the TP:TP phase (i.e., there is a bias toward "total feedback" positive over "total feedback" negative reinforcement). These t-tests of the intercepts resulted in no subjects showing a significant difference between the two phases for either the response-based or the time-based analyses.

Experiment 3 Results Summary

Visual and descriptive analyses showed a difference in slopes for all three subjects for both response-based and time-based analyses. The normative analysis also showed a difference for all three subjects for the response-based analysis. However, MRC analysis showed a difference in slopes for only one of three subjects for the response-based analysis and for only two subjects for the time-based analysis (though it would not be unreasonable, in the context of all analyses, to conclude that NN111's response-based slopes were different, meaning that two subjects rather than one showed the difference). No systematic bias effects were observed under any analysis.

Discussion

Visual and descriptive analyses of slopes showed increased behavioral sensitivity to concurrently available "total feedback" negative versus "total feedback" positive reinforcement over that of concurrently available "total feedback" positive versus "total feedback" positive reinforcement for all subjects for both response-based and time-based analyses.
As in Experiment 2, the differences between phase slopes generally exceeded the normative criterion of .388 (all response-based differences, and two of the three time-based differences), corroborating the findings of the visual and descriptive analyses. In the context of the conclusions from these analyses, and considering that NN111's MRC statistical test of slope differences for the response-based analysis approached significance, the evidence converges on the conclusion that at least two of the three subjects demonstrated the slope difference predicted by CDT. Thus, overall slope effects of Experiment 3 were similar to those of Experiments 1 and 2, Magoon and Critchfield (2005), and Critchfield and Lane (2005). The primary conclusion supported by this outcome is that neither the gain-loss difference nor feedback asymmetry is necessary, but both are sufficient, to heighten response-consequence contingency discriminability. This conclusion supports the prediction of CDT that any difference between consequences will increase response-consequence contingency discriminability, but does not add any new information regarding the qualitative difference between positive and negative reinforcement.

Small and inconsistent intercept differences indicate no systematic shifts (i.e., bias) toward either type of reinforcement. This finding mirrors the results of Experiment 1, Magoon and Critchfield (2005), and Critchfield and Lane (2005). That the consistent and orderly bias found in Experiment 2 was not reproduced in the current experiment further emphasizes the uniqueness of that phenomenon to that experiment.

Experiment 4

When considering the convergent evidence of all methods of analysis, at least 2 of 3 subjects demonstrated the slope differences between phases in Experiment 1, 3 of 3 demonstrated the slope differences in Experiment 2, and at least 2 of 3 demonstrated the slope differences in Experiment 3. The overall conclusion based on this analysis is that neither the gain-loss difference nor feedback asymmetry is necessary, but both are sufficient, to heighten response-consequence contingency discriminability. This outcome is predicted by the CDT interpretation of the differential outcomes effect (Trapold, 1970). At its most fundamental level, the prediction is that any difference between consequences that compete for behavior in a concurrent arrangement will heighten the discriminability of the response-consequence contingencies of the competing responses. The conceptual relationship of this prediction to the outcomes of Experiments 1-3 is that the slopes generally were steeper for "experimental" functions simply because the consequences of the concurrent responses were different.

There is, however, an alternative possibility. The steeper slopes in Experiments 1-3 somehow might have resulted from the mere presence of one of the analogues of negative reinforcement (i.e., "standard" negative, "inverse" positive, "total feedback" negative). To support the former interpretation of the results from Experiments 1-3, a manipulation was required that was able to produce no difference in slopes between phases despite the presence of one of the negative reinforcement analogues in only one of the phases. Experiments 1-3 compared a phase consisting of concurrent different types of reinforcement (X:Y) to a phase consisting of concurrent identical types of reinforcement (X:X).
To support the interpretation that the steeper slopes found in the former phases in those experiments were due simply to the different types of consequences of those concurrent responses, two phases consisting of concurrently arranged identical types of reinforcement were compared to each other, but with the identical types within phase differing across phases (X:X to Y:Y). The best arrangement to meet this criterion was to compare phases that used consequences that were as alike as possible, but where, in one phase, the consequences were one of the negative reinforcement analogues. If the consequences within phase are identical to each other, CDT predicts that the matching functions should have similar within-subject slopes.

Method

Subjects were three college students. In conditions of one phase, VC schedules of "total feedback" positive reinforcement were programmed for both work areas (TP:TP). In conditions of the other phase, VC schedules of "total feedback" negative reinforcement were programmed for both work areas (TN:TN). Consequence magnitudes were 7.5¢. Table 4 shows experimental arrangements and the sequence of conditions completed by each subject for Experiment 4.

Results

Figure 4 plots the data for Experiment 4 (see Appendix B for raw data values) as in Experiments 1-3, except that, because both phases consisted of concurrently available identical types of reinforcement within phase, no special considerations had to be made when plotting the data. Both response (and time allocation) and reinforcement ratios were calculated with left responses in the numerators. Also as in Experiments 1-3, least-squares regression lines, as determined by Equation 2, were fit to the data of each phase. Open circles and heavy lines represent the data for the TP:TP conditions, and open squares and light lines represent the data for the TN:TN conditions. Table 11 lists the regression equations (from Equation 2) and r² values for individual data, separately for response allocation and time allocation, for both TP:TP and TN:TN phases. The proportions of variance accounted for ranged from good (e.g., SS114 TN:TN time r² = .939) to excellent (e.g., JL113 TP:TP responses r² = .998), with similar fits for both phases (TP:TP: .966 ≤ r² ≤ .998, M = .983; TN:TN: .939 ≤ r² ≤ .989, M = .971). Parameter estimates are addressed below.

Slopes (Sensitivity)

Visual analysis. Visual inspection of response-based graphs in Figure 4 shows a difference in slopes between the TN:TN and TP:TP matching functions for one subject. SS114's TN:TN response-based slope is steeper than the TP:TP response-based slope, but the time-based slopes are not different. There is no difference between slopes for either of the other subjects for either the response-based or the time-based analyses.

Descriptive analysis. The fifth column of Table 11 lists the difference between the TN:TN phase a (slope) coefficient and the TP:TP phase a coefficient for each subject (a(TN:TN) − a(TP:TP)). A positive a-difference indicates a steeper a for the TN:TN phase than for the TP:TP phase, and a negative a-difference indicates a shallower a for the TN:TN phase than for the TP:TP phase. As with visual inspection, a-difference outcomes were inconsistent across subjects. Two subjects had positive a-differences for response-based analyses (JL113 = .206; SS114 = .376) while the third had a negative a-difference (MJ112 = -.067).
Two subjects had positive a-differences for time-based analyses (JL113 = .116; SS114 = .119) while the third had a negative a-difference (MJ112 = -.049). All subjects had larger a-differences for responses than for time. Two subjects had a-differences close to zero for response-based analyses (MJ112 and JL113) while the third had a much larger a-difference (SS114). All subjects had a-differences close to zero for time-based analyses. Three of six TP:TP slopes (MJ112's and JL113's slopes for responses and MJ112's slope for time) and all but one of the TN:TN slopes (SS114's slope for time) were greater than 1.

Normative analysis. According to the normative criterion derived from Robinson's (1992) review of the matching literature, a difference of slopes ≥ .388 is a meaningful difference. No slope differences in Experiment 4 were larger than .388.

Inferential statistical analysis. Table 12 summarizes the results of the MRC analysis. All R² values were good (e.g., SS114 M.E. responses R² = .926) to excellent (e.g., JL113 Full time R² = .993). In Table 12, the "R² Change" value is the change in variance accounted for by the addition of the B_12 interaction parameter, and the p value is the probability of obtaining a difference of that size if there is no actual difference in variance. By this test, none of the subjects showed a statistically significant slope difference for either the response-based or the time-based analyses.

Intercepts (Bias)

Visual analysis. Visual inspection of both response-based and time-based graphs in Figure 4 shows no systematic bias for either type of reinforcement. The graphs show no pronounced intercept difference between functions for any subject and all intercepts are close to zero.

Descriptive analysis. The last column of Table 11 lists the difference between the TN:TN phase log c (intercept) coefficient and the TP:TP phase log c coefficient for each subject (c(TN:TN) − c(TP:TP)). A positive log c-difference indicates a higher log c for the TN:TN phase than for the TP:TP phase, and a negative log c-difference indicates a lower log c for the TN:TN phase than for the TP:TP phase. As with visual inspection, log c-difference outcomes were inconsistent across subjects. Two subjects had negative log c-differences for response-based analyses (JL113 = -.272; SS114 = -.024) while the third had a positive log c-difference (MJ112 = .149). Two subjects had negative log c-differences for time-based analyses (JL113 = -.193; SS114 = -.041) while the third had a positive log c-difference (MJ112 = .128). Two subjects had larger log c-differences for response-based analyses than for time-based analyses (MJ112 and JL113) while the third had a larger log c-difference for the time-based analysis than for the response-based analysis (SS114). All subjects had log c-differences close to zero (-.272 ≤ c(TN:TN) − c(TP:TP) ≤ .149).

Inferential statistical analysis. For the t-tests of the intercepts against the nil hypothesis, one subject showed statistically significant biases, though they were to opposite sides for the different phases. JL113 showed a statistically significant left-side bias for both response-based and time-based analyses for the regressions of the data from the TP:TP phase (response t = 2.785, p = .050; time t = 2.940, p = .042) and a statistically significant right-side bias for both response-based and time-based analyses for the regressions of the data from the TN:TN phase (response t = -2.792, p = .049; time t = -2.910, p = .044).
For the tests of the intercepts against each other, the data from the conditions of the TP:TP phase were used as the reference group. This means that a significant positive t-value indicates that the intercept of the TN:TN phase is statistically significantly larger than the intercept of the TP:TP phase (i.e., there is a larger left-side bias for concurrent "total feedback" negative reinforcement than there is for concurrent "total feedback" positive reinforcement) and a significant negative t-value indicates that the intercept of the TN:TN phase is statistically significantly smaller than the intercept of the TP:TP phase (i.e., there is a larger left-side bias for concurrent "total feedback" positive reinforcement than there is for concurrent "total feedback" negative reinforcement). These t-tests of the intercepts of Experiment 4 resulted in one subject showing a statistically significant difference between the two phases. JL113 showed a significantly larger left-side bias for the regression of the data from the TP:TP phase over the regression of the data from the TN:TN phase for both response-based and time-based analyses (response t = -3.944, p = .017; time t = -4.137, p = .014).

Experiment 4 Results Summary

Evidence from the visual, normative, and MRC analyses converges on the conclusion that there were no systematic slope or intercept differences for any subject for either the response-based or the time-based analyses. Descriptive analysis was the only analysis that resulted in slope and intercept differences for all three subjects for both response-based and time-based analyses; however, the differences were unsystematic.

Discussion

Visual, normative, and MRC analyses all converged on the conclusion that there were no slope or intercept differences for any subject. In contrast, descriptive analysis resulted in slope and intercept differences for all three subjects for both response-based and time-based analyses, though not in any systematic way. A contradiction such as this is not surprising whenever success is measured as the absence of an effect. That is because simple comparisons of regression coefficients from two different functions will almost always result in the conclusion that they are different. It is highly unlikely that within-subject matching functions would ever result in exactly the same coefficients, even if exactly the same consequences were used for both phases. Given that the differences found from the descriptive analysis were unsystematic, those results may be interpreted as corresponding to the results from the visual, normative, and MRC analyses, which, in total, converge on and support the tentative conclusion from Experiment 3 that neither the gain-loss difference nor feedback asymmetry is necessary, but both are sufficient, to heighten response-consequence contingency discriminability. This conclusion supports the general prediction made by CDT that any difference between consequences will heighten the response-consequence contingency discriminability of concurrent responses, making behavior more sensitive to changing relative rates of reinforcement.
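Before turning to the General Discussion, it may help to recap the five consequence arrangements compared across the four experiments. The mapping below is an informal summary of the Method descriptions above, written as a Python data structure for compactness (magnitudes, which varied across experiments, are omitted):

    # (outcome when the schedule requirement is met, outcome when it is missed);
    # "signaled" means a feedback message was flashed, "silent" means no stimulus change.
    CONTINGENCIES = {
        "standard positive":       ("gain, signaled",         "no gain, silent"),
        "standard negative":       ("loss avoided, silent",   "loss, signaled"),
        "inverse positive":        ("gain, silent",           "no gain, signaled"),
        "total-feedback positive": ("gain, signaled",         "no gain, signaled"),
        "total-feedback negative": ("loss avoided, signaled", "loss, signaled"),
    }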
GENERAL DISCUSSION

Results Summary

A prediction of CDT (Alsop, 1991; Davison & Jenkins, 1985; Davison & Nevin, 1999; Davison & Tustin, 1978; Nevin, 1981, 2005) is that any qualitative difference between consequences of concurrent responses will increase response-consequence contingency discriminability between those responses and, as a result, will increase behavioral sensitivity to individual response-consequence contingencies. That steeper least-squares regression slopes were found in the present investigation when responding was maintained by concurrently available different consequences than when it was maintained by concurrently available identical consequences supports this prediction. Figure 5 illustrates this by showing the slopes (top panel) and between-phase slope differences (bottom panel) for each subject in the four experiments.

The top panel of Figure 5 shows slope values for the data from each phase for each subject. Dark bars represent the slopes of "baseline" phases and light bars represent the slopes of "experimental" phases. In Experiments 1-3, "baseline" phases arranged identical consequences for each component of the concurrent schedules and "experimental" phases arranged different consequences for each component. Experiment 4 arranged identical consequences within both phases but different consequences across phases. It can be seen in the top panel of Figure 5 that all "experimental" phase slopes were greater than 1 whereas only three "baseline" phase slopes were. Of those three, two were in Experiment 4, where both phases used identical consequences within phase. This panel also shows that slopes were consistently higher when phases arranged concurrent different consequences than when they arranged concurrent identical consequences (Experiments 1-3) and were not consistently different when both phases arranged identical consequences (Experiment 4). The lower panel of Figure 5 further highlights this outcome.

In the lower panel of Figure 5, bars represent slope differences between regression functions ("experimental" phase slope minus "baseline" phase slope). In all cases where phases that arranged concurrent different consequences were compared to phases that arranged concurrent identical consequences (Experiments 1-3), the "experimental" phase slope values were larger, as reflected by the positive slope differences. When identical consequences were used within phase for both phases (Experiment 4), one subject showed a negative slope difference and the other two showed positive slope differences. The horizontal line in the lower panel of Figure 5 identifies the normative slope-change criterion determined from Robinson's (1992) data. All but one of the subjects in Experiments 1-3 met this criterion. None of the subjects in Experiment 4 did.

In addition to the response-consequence contingency discriminability prediction discussed above, CDT predicts increased behavioral sensitivity (i.e., slope increases) if the stimuli signaling the availability of concurrent responses are distinct (i.e., stimulus-response contingency discriminability). Magoon and Critchfield (2005) and Critchfield and Lane (2005) found slope differences like those of the present investigation using nearly identical procedures but using unsystematically different target-box colors across conditions.
Consequently, those experiments could not reasonably determine whether their results were due to stimulus-response contingency discriminability, response-consequence contingency discriminability, or an interaction/combination of the two. The present procedures used identical target-box colors in every condition; therefore, within the context of CDT, all slope differences are interpreted as being solely the product of differential response-consequence contingency discriminability between phases.

Another difference between Magoon and Critchfield's (2005) procedures and those of the present investigation was that their experiment employed money supplements for sessions involving negative reinforcement. Replication of the slope differences in the present investigation without such supplements indicated that they have no differential functional effect on behavior. It was noted earlier that Critchfield and Lane (2005) found much smaller slope differences than Magoon and Critchfield without using supplements. That was not the case in the present investigation. The magnitude of slope differences in the present investigation more closely approximated those found by Magoon and Critchfield than those found by Critchfield and Lane.

Implications for Quantitative Model Development of CDT

Various quantitative expressions of CDT have been proposed (Alsop, 1991; Davison & Jenkins, 1985; Davison & Nevin, 1999; Davison & Tustin, 1978; Nevin, 1981, 2005), the relative merits of which are still being debated. While the conceptual foundations of CDT are widely accepted by the proponents of all existing models, it seems premature to commit to any particular one. However, the present investigation might provide both a strategy to address the debate generally and results to inform model development efforts specifically.

As described previously, CDT predicts that a difference between concurrently available consequences along any dimension (e.g., reinforcement frequency, magnitude, type, delay) will heighten response-consequence contingency discriminability and thereby increase behavioral sensitivity to individual consequences. However, neither CDT nor any quantitative expression of it specifies the manner in which these other dimensions interact with behavior or each other to affect behavioral outcomes. For example, Davison and Nevin (1999) invoked two parameters, d_sb and d_br, to quantify CDT's stimulus-response and response-consequence contingency discriminability propositions, respectively. However, they did not specify how choice-controlling variables such as reinforcement frequency, magnitude, type, and delay may (or may not) interact to contribute to d_br. They acknowledged that "[i]f [their] model is to be extended to choice-controlling variables other than frequency of reinforcement, a substantial research effort will be required to identify independent functions for the discriminability and value of the consequences of choice" (Davison & Nevin, 1999, p. 472). The implication was that it is at least in principle possible to specify how these variables interact to contribute to d_br. The methods of the present investigation provide a possible strategy by which to approach the problem.

Equation 1 provides two measures that characterize deviations from strict matching (Herrnstein, 1961). Log c describes constant response bias (Baum, 1974) and a_x describes the degree of effect a given variable (X) has on response allocation (Baum, 1979; Landon et al., 2003).
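For reference, these parameters enter the generalized matching relation in its conventional power-function and logarithmic forms (cf. Baum, 1974); the notation below follows the usual convention and is assumed to correspond to the document's Equations 1 and 2:

    \frac{B_1}{B_2} = c \left( \frac{X_1}{X_2} \right)^{a_x}
    % power-function form (Equation 1)

    \log \frac{B_1}{B_2} = a_x \log \frac{X_1}{X_2} + \log c
    % logarithmic form fit by least squares (Equation 2)

Here B_1 and B_2 are the behavior (responses or time) allocated to the two alternatives, and X_1 and X_2 are the obtained values of the controlling consequence variable (in the present investigation, reinforcement frequency).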
These parameters could be the metrics by which to compare the interactions (or lack thereof) of different choice-controlling variables and their influence on behavior. Some research on magnitude of reinforcement has already taken this approach and serves here to illustrate the general strategy.

Landon et al. (2003) arranged a concurrent-schedule switching-key procedure in which pigeons' key-pecking responses were reinforced. Relative frequency of reinforcement and overall absolute magnitude of reinforcement were held equal and constant across conditions but relative magnitude was varied. They found that pigeons' pecking behavior undermatched (i.e., a < 1) relative reinforcer magnitude ratios and did so to a greater degree than found in a different study of the same birds when relative reinforcer frequency ratios were varied (i.e., magnitude a < frequency a) (see Landon et al., 2003, Table 2 for the relevant comparison). Two of their conclusions were that behavior is less sensitive to reinforcer magnitude than it is to reinforcer frequency and that the effects of these two variables are independent.

McLean and Blampied (2001) arranged a two-key concurrent-schedules procedure in which pigeons' key-pecking responses were reinforced. Relative frequency of reinforcement was varied across identical ranges for two phases, one in which the concurrently available reinforcers were of identical magnitude, and one in which the concurrently available reinforcers were of different magnitude. In the phase with different magnitudes of reinforcement, there was a pronounced bias for the larger magnitude reinforcers (i.e., log c ≠ 0, toward the larger magnitude reinforcer) while there was no systematic bias in the other phase (i.e., log c = 0). There were no systematic slope (i.e., a) differences between the two phases. While McLean and Blampied's results corresponded well with Landon et al.'s (2003) conclusion that reinforcement frequency and magnitude affect behavior independently, their results did not agree with Landon et al.'s conclusion that behavior is less sensitive to reinforcer magnitude than it is to reinforcer frequency.

Critchfield and Merrill (2005) extended McLean and Blampied's work to human subjects using procedures modeled closely after those of the present investigation. One of their phases arranged concurrently available schedules of positive reinforcement with equal magnitude reinforcers and the other arranged concurrently available schedules of positive reinforcement with different magnitudes of reinforcement. Their results mirrored those of McLean and Blampied (2001) in that the phase with the different magnitudes of reinforcement showed a clear bias toward the larger magnitude reinforcers, the other phase showed no bias, and there were no slope differences between phases.

The three studies just discussed illustrate how within-subject comparisons of matching functions can be used to analyze the separate effects of frequency and magnitude of reinforcement on behavior. The results of all three studies converge on the conclusion that magnitude of reinforcement exerts an effect independent of that of reinforcement frequency. Landon et al. (2003) came to this conclusion because nonzero slopes resulted when only relative magnitude of reinforcement was varied.
McLean and Blampied (2001) and Critchfield and Merrill (2005) came to the same conclusion when there were intercept but not slope differences between phases when the range of relative reinforcement frequency was held constant across phases but magnitude was the same within one phase and different within the other. In all three cases, the parameters derived from Equation 1 were the metrics by which effects were evaluated.

The present investigation adopted this same general strategy to assess independent and interactive effects between frequency and type of reinforcement. The difference between outcomes of Experiments 1-3 and Experiment 4 suggests an independence of reinforcer-frequency effects and reinforcer-type effects that is similar to the independence found between reinforcer-frequency effects and reinforcer-magnitude effects just described. However, these results cannot rule out the possible necessity of an interaction between frequency and type-difference to produce the increased slopes found in Experiments 1-3. To test further for frequency and type independence would require an arrangement where frequency was constant and unchanged in all conditions but type was manipulated across conditions.

Consider an experiment where the "baseline" phases consisted of concurrently available schedules of positive reinforcement with the relative reinforcement frequency ratio fixed at 1:1 for all conditions and the relative reinforcement magnitude ratios varied across a range of values (as in Landon et al., 2003), and where the "experimental" phases consisted of schedules identical to those of the "baseline" phases except that reinforcer type (e.g., positive vs. negative) would be different for the two concurrent responses. The findings of Landon et al. (2003) predict a nonzero slope for the "baseline" function based strictly on the effects of varied relative reinforcer magnitudes. Any increase in slope in the "experimental" phase would further support the independence of reinforcer frequency and type.

However, even given this outcome from such an experiment, it would be premature to assume that the effects resulting from the positive and negative reinforcement manipulation would apply to all instances of concurrently available different types of reinforcement. Further research would require experiments with similar methods using other qualitatively different outcomes, such as money versus college course extra credit with college students as subjects, or qualitatively different types of food with nonhumans as subjects. Extending this general research strategy to other response-consequence choice-controlling variables and their various combinations could provide the means by which they might be entered into a more fully integrated quantitative expression of CDT. However, before such a research program could proceed, an effort needs to be made to identify a standard means by which to evaluate GMR parameter differences.

Evaluating GMR Parameter Differences

If a serious effort is to be made toward investigating the possible interactions among choice-controlling variables using procedures that compare matching functions, an equally serious effort will have to be made toward developing a standard, objective method of determining whether matching parameters (e.g., slopes) derived from the same individual in two experimental phases differ significantly from each other.
A search of the literature in the experimental analysis of behavior revealed few instances in which fitted parameters of a single quantitative model were compared. Rather, a common strategy has been to evaluate the variance accounted for by a given model (e.g., Baum, 1979; Davison & Nevin, 1999), or to compare variance accounted for by two different models (e.g., Alsop, 1991; Davison, 1991; Davison & Jenkins, 1985; Davison & Tustin, 1978). When experimental designs have permitted the determination of fitted parameters for two or more functions, using the same model and subject, no consistency is apparent across investigations in the method of comparing parameter estimates. Some investigators have relied only on nontechnical means such as visual inspection of graphed data or of fitted parameter estimates (e.g., Bron et al., 2003; Madden & Perone, 1999; McLean & Blampied, 2001; White, Pipe, & McLean, 1984), while others have also applied inferential statistics toward this end (e.g., Landon et al., 2003; McCarthy & Davison, 1980; McCarthy, Davison, & Jenkins, 1982).

The conclusions of the present investigation, both within and across experiments, were based on convergent evidence from all methods of analysis (i.e., exploratory data analysis). However, the results of individual analyses illustrate that different means of comparing parameter estimates can promote different conclusions. This is highlighted by referring to Table 13, which summarizes the results of each analysis of slopes for each experiment of the present investigation. Each cell of Table 13 shows the proportion of subjects who showed a slope increase for "experimental" phases over that of "baseline" phases. Individually, the visual and normative analyses support the overall conclusions stated above (though visual analysis may not, and may be more akin to descriptive analysis, if SS114's slopes are considered different), but the descriptive and MRC analyses each support a different conclusion. Descriptive analysis per se promotes no conclusion because all but one subject showed a slope increase regardless of independent variable manipulation. MRC analysis per se promotes the conclusion that feedback differences are both necessary and sufficient, and gains versus losses are neither necessary nor sufficient, to increase behavioral sensitivity, because all three subjects showed a slope increase in Experiment 2 and only one showed a slope increase in Experiment 3. While this conclusion would provide interesting information regarding positive and negative reinforcement, it would fail to support CDT.

Each method of analysis has its own limitations. Visual inspection may admit a degree of subjectivity (DeProspero & Cohen, 1979; Fisch, 1998) because, as effects become less clear, human observers become less reliable in judging them (DeProspero & Cohen, 1979; Jones, Weinrott, & Vaught, 1978; Parsonson & Baer, 1978, 1986, 1992). Descriptive analysis, which in the present investigation means ordinal-scale comparisons of parameter estimates of Equation 2, will always result in the conclusion that slopes are different because it does not take into account the magnitude of the difference (a difference of .01 counts the same as a difference of 1.0) and inherent behavioral variability virtually guarantees some difference.
The normative criterion derived from Robinson (1992) is based on intersubject variance, which is not functionally identical to intrasubject variance because intersubject variance incorporates not only the variance attributable to the independent variable, but also inherent behavioral error variance and the effects of other study-specific factors that might have had a possible systematic bearing on slopes (e.g., changeover delay duration). MRC analysis, while a valid statistical tool for assessing slope differences between least-squares linear regressions, is limited by the procedures of the present investigation. Specifically, the research tradition underlying the current procedures requires extended observation of each individual in each experimental condition (i.e., many sessions) so that error variance arising from transitory variables (e.g., acquisition) may be minimized and so that the data show what is typical for each individual under each condition (Sidman, 1960). This approach tends to yield strong experimental control but at a high cost, particularly when using humans as subjects, as experiments become lengthy (making subject retention difficult) and expensive (when money is used as reinforcement, an obvious economic constraint is put on the researcher). For practical reasons, then, experimental analyses within this research tradition tend to yield relatively few data points per subject, and few data points per subject mean that the power of any within-subject inferential-statistical test is likely to be extremely low (Cohen, 1988).

Statistical power is broadly defined as the probability that a statistical test will detect an effect when an effect is actually present. The limitation of potentially low statistical power was not an issue in the present investigation when MRC resulted in statistical significance; in fact, reaching significance with a likely low-power test supports the conclusion of an extremely strong effect. The issue is only relevant when statistical significance was not reached or was "approached" (e.g., Experiment 1: SO104's "R² Change" for the time-based analysis, p = .063; Experiment 3: NN111's "R² Change" for the response-based analysis, p = .061). In those cases, it is unknown whether an effect was not significant because of inadequate effect size or because of inadequate sample size. This is particularly relevant in Experiment 4, where no effects were statistically significant and the absence of an effect was considered a "success." It would have been useful to know the power of the tests with the sample sizes used in the present investigation, but the power tables provided by Cohen (1988) for MRC analyses provide values only for cases with a minimum of 20 degrees of freedom. In the present experimental design, the available degrees of freedom were 5 for Main Effects models and 4 for Full models.
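Although Cohen's (1988) tables stop short of such small designs, the noncentral-F computation those tables approximate can be carried out directly with modern software. The sketch below is illustrative only; it assumes Python with SciPy and Cohen's definition of the noncentrality parameter, and was not part of the original analyses:

    from scipy.stats import f as f_dist, ncf

    def r2_change_power(r2_full, r2_reduced, df_num, df_denom, alpha=0.05):
        """Post hoc power of the R-squared-change F test for added predictors."""
        f2 = (r2_full - r2_reduced) / (1.0 - r2_full)  # Cohen's f-squared
        lam = f2 * (df_num + df_denom + 1)             # noncentrality, per Cohen (1988)
        f_crit = f_dist.ppf(1 - alpha, df_num, df_denom)
        return ncf.sf(f_crit, df_num, df_denom, lam)   # P(reject | effect present)

    # Hypothetical Full-model interaction test with the present design's df (1 and 4):
    print(r2_change_power(r2_full=0.95, r2_reduced=0.85, df_num=1, df_denom=4))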
Several strategies to address concerns over statistical power in within-subject, parametric investigations are worth considering. One strategy could be to run more conditions per empirical function, but as just discussed, subject retention and cost are formidable problems. A second strategy worth exploring would be to develop procedures that could generate asymptotic responding after less extensive contingency exposure. It should be noted, however, that with existing procedures, humans typically achieve asymptotic response levels much more quickly than do nonhumans. For example, stability was achieved in the present investigation within 5 to 23 eight-minute sessions (M = 10.64), or about 40 to 184 minutes of contingency exposure (see Appendix B), and in the McLean and Blampied (2001) study (pigeons) within 20 to 76 forty-minute sessions, or about 800 to 3040 minutes of contingency exposure. It is worth questioning, therefore, how much more efficient the experimental procedures can become. A third strategy worth considering would be to abandon the individual as the primary unit of analysis and adopt the traditional approach of seeking effects in the aggregated data of many individuals. However, if roughly the same procedures of the present investigation were replicated with more subjects, the subject retention issue might be avoided, but the cost of running an experiment would increase dramatically. To make a groups-design approach more practically tenable, experimental conditions could be abbreviated (e.g., a strict limit could be placed on the number of sessions per condition regardless of performance), or each matching-function data point could be obtained from a different group, but these approaches would compromise the research tradition's historical emphasis on asymptotic performance and strong within-subject experimental control, respectively.

However they might be obtained, empirical functions incorporating a sufficient number of experimental conditions to meet minimum criteria for power analysis could be analyzed via MRC analysis, the statistical power of those tests could then be determined empirically for each individual (Cohen, 1988), and the power analysis could serve as an index of the confidence to be placed in the statistical test. One can imagine an approach, for instance, in which only subjects achieving sufficient statistical power would be considered in the analysis. Other subjects would be reported but not included in the analysis, on grounds of weak experimental control.

An alternative to the methods of GMR parameter comparison used in the present investigation, and to any of the individual analyses per se, would be to obtain normative data on intrasubject parameter variance. Needed are studies that include within-subject matching-function replications across many subjects and many species. Given the historical emphasis in the experimental analysis of behavior on replication as a primary tool for increasing experimental control (Sidman, 1960), this would seem the preferred course of action.

A Methodological Concern of the Present Investigation

The methods employed in the present investigation appear to provide a promising set of tools with the potential to advance a broad research agenda. However, before they can be viewed as a model by which to proceed, a specific procedural issue needs to be addressed. Humans are useful as subjects because they allow the use of verbal stimuli, but they are troublesome as subjects for the same reason.

While instructional stimuli are valuable tools that promote acquisition in the absence of explicit shaping (Pilgrim, 1998), they and other verbal stimuli (e.g., "+", "-", and the word "No" in the present investigation) can be problematic because they necessarily rely on subjects' pre-experimental histories. Because these histories may vary across subjects in unknown ways, such stimuli introduce a possible source of intersubject variability. For example, Lane and Cherek (1999) reported that non-reinforced trials, or lost opportunities to earn reinforcement (e.g., "No +4"), might serve as "aversive"
For example, Lane and Cherek (1999) reported that nonreinforced trials, or lost opportunities to earn reinforcement (e.g., "No +4"), might serve as "aversive" stimuli and exert substantial differential control over behavior. The present investigation conceptualized a stimulus signaling a loss (i.e., "-4") as essentially equivalent to a stimulus signaling a lost opportunity to earn (i.e., "No +4"). If Lane and Cherek are correct, there is no way of knowing whether this conceptualization is accurate. As a matter of conjecture, an inequality in the "aversiveness" of these stimuli may have been responsible for the bias observed in Experiment 2. As a matter of further conjecture, the bias may not have appeared in Experiments 3 and 4, where the "No +4" stimulus was also used, because of the overall richer feedback context.

It would be worth exploring procedures in which experimental repertoires are strictly shaped, and in which nonsense stimuli are established as conditioned consequences, as in procedures used with nonhumans. The latter task would be especially difficult to accomplish, however, because the required conditioning would necessarily involve some stimuli with which subjects have had previous experience. Nevertheless, the overall goal would be to establish procedures eliminating as many verbal stimuli as possible.

Another approach to assessing the potentially troublesome role of verbal stimuli in the present procedures would be to find a means of testing the interspecies generality of the effects. The nonhuman conditioned-reinforcement procedure developed by Jackson and Hackenberg (1996) provides such a means. In that procedure, a row of lights is presented above the response keys, and individual lights in the row are illuminated as the requirements of the reinforcement schedule in effect are met. After a set time, number of obtained conditioned reinforcers, or number of responses, an exchange period is introduced in which responses on a different key allow brief access to food and extinguish one of the lights. It seems reasonable that one row of lights could be arranged for each of the two response keys of a two-ply concurrent schedule. One schedule could program "light extinguishings" according to one variable-cycle schedule, and the other could program "light illuminations" according to a different variable-cycle schedule. Responses on the key associated with the former could cancel scheduled light extinguishings, and responses on the key associated with the latter could produce light illuminations. After appropriate shaping, pigeons could then be exposed to the same series of conditions as in the present investigation. Replication of the present investigation's results would suggest an insignificant role for verbal stimuli in contributing to those results. Naturally, a similar procedure could be arranged for a variety of species and could be adapted to examine other choice-controlling variables (e.g., different colored lights associated with different magnitudes of reinforcement). A rough simulation sketch of the proposed arrangement follows.
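The sketch below simulates the two proposed components under stated assumptions: all function and parameter names are hypothetical, responding is reduced to a per-cycle probability, and a uniform interval generator stands in for the Fleshler-Hoffman (1962) progression from which such schedules are ordinarily built. It illustrates the contingency logic only; it is not the Jackson and Hackenberg (1996) procedure itself.

import random

def vc_intervals(mean_s, n, spread=0.5):
    # Crude stand-in for a variable-cycle interval generator; actual VC
    # schedules would use the Fleshler-Hoffman (1962) progression.
    return [random.uniform(mean_s * (1 - spread), mean_s * (1 + spread))
            for _ in range(n)]

def simulate(p_avoid, p_collect, mean_loss=15.0, mean_gain=30.0,
             session_s=480.0):
    # Left row (negative reinforcement): each "loss" cycle schedules one
    # light extinguishing, canceled if a left-key response occurs during
    # that cycle (probability p_avoid). Right row (positive
    # reinforcement): each "gain" cycle arms one light illumination,
    # delivered if a right-key response occurs during that cycle
    # (probability p_collect).
    avoided = lost = lit = 0
    t = 0.0
    for interval in vc_intervals(mean_loss, 10000):
        t += interval
        if t > session_s:
            break
        if random.random() < p_avoid:
            avoided += 1
        else:
            lost += 1
    t = 0.0
    for interval in vc_intervals(mean_gain, 10000):
        t += interval
        if t > session_s:
            break
        if random.random() < p_collect:
            lit += 1
    return avoided, lit, lost

print(simulate(p_avoid=0.8, p_collect=0.6))

Allocating more responding to one key in such a simulation trades avoided extinguishings against collected illuminations, which is precisely the competition between negative and positive reinforcement that the proposed experiment would place under concurrent-schedule control.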
Conclusion

The present investigation supported CDT's qualitative prediction that concurrently available different reinforcer types enhance behavioral sensitivity to changing reinforcement rates. CDT is a promising unified theory of both stimulus and consequence control, and determining its validity is an important mission. Perhaps just as important, however, the present investigation illustrated a general strategy for mapping interactions among choice-controlling consequence variables. It also raised critical questions about how the results of studies conducted according to this strategy are to be evaluated. Given that theories come and go, the present investigation's most valuable contribution may pertain to these methodological issues.

REFERENCES

Alsop, B. L. (1991). Behavioral models of signal detection and detection models of choice. In M. L. Commons, J. A. Nevin, & M. C. Davison (Eds.), Signal detection: Mechanisms, models, and applications (pp. 39-55). Hillsdale, NJ: Erlbaum.

Baum, W. M. (1974). On two types of deviation from the matching law: Bias and undermatching. Journal of the Experimental Analysis of Behavior, 22, 231-242.

Baum, W. M. (1979). Matching, undermatching, and overmatching in studies of choice. Journal of the Experimental Analysis of Behavior, 32, 269-281.

Bron, A., Sumpter, C. E., Foster, T. M., & Temple, W. (2003). Contingency discriminability, matching, and bias in the concurrent-schedule responding of possums (Trichosurus vulpecula). Journal of the Experimental Analysis of Behavior, 79, 289-306.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.

Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Mahwah, NJ: Erlbaum.

Critchfield, T. S., & Lane, S. D. (2005). Is bad stronger than good? A search for differential impact effects in concurrent schedules, delay discounting, probability discounting, and risky choice. Manuscript submitted for publication.

Critchfield, T. S., & Merrill, D. (2005). Human concurrent schedule performance under heterogeneous reinforcement: Testing predictions of matching and contingency discriminability models. Manuscript submitted for publication.

Critchfield, T. S., Paletz, E. M., MacAleese, K. R., & Newland, M. C. (2003). Punishment in human choice: Direct or competitive suppression? Journal of the Experimental Analysis of Behavior, 80, 1-27.

Davison, M. C. (1991). Stimulus discriminability, contingency discriminability, and complex stimulus control. In M. L. Commons, J. A. Nevin, & M. C. Davison (Eds.), Signal detection: Mechanisms, models, and applications (pp. 57-78). Hillsdale, NJ: Erlbaum.

Davison, M., & Jenkins, P. E. (1985). Stimulus discriminability, contingency discriminability, and schedule performance. Animal Learning & Behavior, 13, 77-84.

Davison, M., & McCarthy, D. (1988). The matching law: A research review. Hillsdale, NJ: Erlbaum.

Davison, M., & Nevin, J. A. (1999). Stimuli, reinforcers, and behavior: An integration. Journal of the Experimental Analysis of Behavior, 71, 439-482.

Davison, M. C., & Tustin, R. D. (1978). The relation between the generalized matching law and signal detection theory. Journal of the Experimental Analysis of Behavior, 29, 331-336.

DeProspero, A., & Cohen, S. (1979). Inconsistent visual analysis of intrasubject data. Journal of Applied Behavior Analysis, 12, 573-579.

deVilliers, P. A. (1972). Reinforcement and response rate interaction in multiple random-interval avoidance schedules. Journal of the Experimental Analysis of Behavior, 18, 499-507.

Farley, J., & Fantino, E. (1978). The symmetrical law of effect and the matching relation in choice behavior. Journal of the Experimental Analysis of Behavior, 29, 37-60.

Fisch, G. S. (1998). Visual inspection of data revisited: Do the eyes still have it? The Behavior Analyst, 21, 111-123.

Fleshler, M., & Hoffman, H. S. (1962). A progression for generating variable-interval schedules. Journal of the Experimental Analysis of Behavior, 5, 529-530.
Green, D. M., & Swets, J. A. (1966). Signal-detection theory and psychophysics. New York: Wiley.

Herrnstein, R. J. (1961). Relative and absolute strength of response as a function of frequency of reinforcement. Journal of the Experimental Analysis of Behavior, 4, 267-272.

Herrnstein, R. J. (1970). On the law of effect. Journal of the Experimental Analysis of Behavior, 13, 243-266.

Hollard, V., & Davison, M. C. (1971). Preference for qualitatively different reinforcers. Journal of the Experimental Analysis of Behavior, 16, 375-380.

Jackson, K. J., & Hackenberg, T. D. (1996). Token reinforcement, choice, and self-control in pigeons. Journal of the Experimental Analysis of Behavior, 66, 29-49.

Jones, R. R., Weinrott, M., & Vaught, R. S. (1978). Effects of serial dependency on the agreement between visual and statistical inference. Journal of Applied Behavior Analysis, 11, 277-283.

Kollins, S. H., Newland, M. C., & Critchfield, T. S. (1997). Human sensitivity to reinforcement in operant choice: How much do consequences matter? Psychonomic Bulletin & Review, 4, 208-220.

Landon, J., Davison, M., & Elliffe, D. (2003). Concurrent schedules: Reinforcer magnitude effects. Journal of the Experimental Analysis of Behavior, 79, 351-365.

Lane, S. D., & Cherek, D. R. (1999). Decision under conditions of risk: Exploring some parameters and quantitative models. Experimental Analysis of Human Behavior Bulletin, 17, 15-19.

Logue, A. W., & deVilliers, P. A. (1981). Matching of behavior maintained by concurrent shock avoidance and food reinforcement. Behaviour Analysis Letters, 1, 247-258.

Madden, G. J., & Perone, M. (1999). Human sensitivity to concurrent schedules of reinforcement: Effects of observing schedule-correlated stimuli. Journal of the Experimental Analysis of Behavior, 71, 303-318.

Magoon, M. A., & Critchfield, T. S. (2005). Concurrent schedules of positive and negative reinforcement: Differential-impact and differential-outcomes effects. Manuscript submitted for publication.

McCarthy, D., & Davison, M. (1980). Independence of sensitivity to relative reinforcement rate and discriminability in signal detection. Journal of the Experimental Analysis of Behavior, 34, 273-284.

McCarthy, D., Davison, M., & Jenkins, P. E. (1982). Stimulus discriminability in free-operant and discrete-trial detection procedures. Journal of the Experimental Analysis of Behavior, 37, 199-215.

McLean, A. P., & Blampied, N. M. (2001). Sensitivity to relative reinforcer rate in concurrent schedules: Independence from relative and absolute reinforcer duration. Journal of the Experimental Analysis of Behavior, 75, 25-42.

Miller, H. L. (1976). Matching-based hedonic scaling in the pigeon. Journal of the Experimental Analysis of Behavior, 26, 335-347.

Nevin, J. A. (1969). Signal detection theory and operant behavior: A review of David M. Green and John A. Swets' "Signal-detection theory and psychophysics." Journal of the Experimental Analysis of Behavior, 12, 475-480.

Nevin, J. A. (1981). Psychophysics and reinforcement schedules: An integration. In M. L. Commons & J. A. Nevin (Eds.), Quantitative analyses of behavior: Vol. 1. Discriminative properties of reinforcement schedules (pp. 3-27). Cambridge, MA: Ballinger.

Nevin, J. A. (2005, February). Reinforcement, attending, and remembering. Paper presented at the California Association for Behavior Analysis, Dana Point, CA.
Parsonson, B. S., & Baer, D. M. (1978). The analysis and presentation of graphic data. In T. R. Kratochwill (Ed.), Single subject research (pp. 101-165). New York: Academic Press.

Parsonson, B. S., & Baer, D. M. (1986). The graphic analysis of data. In A. Poling & R. W. Fuqua (Eds.), Research methods in applied behavior analysis: Issues and advances (pp. 157-186). New York: Plenum.

Parsonson, B. S., & Baer, D. M. (1992). The visual analysis of data, and current research into the stimuli controlling it. In T. R. Kratochwill & J. R. Levin (Eds.), Single-case research design and analysis (pp. 15-41). Hillsdale, NJ: Erlbaum.

Pierce, W. D., & Epling, W. F. (1983). Choice, matching, and human behavior: A review of the literature. The Behavior Analyst, 6, 57-76.

Pilgrim, C. (1998). The human subject. In K. A. Lattal & M. Perone (Eds.), Handbook of research methods in human operant behavior (pp. 15-44). New York: Plenum.

Redmon, W. K., & Lockwood, K. (1986). The matching law and organizational behavior. Journal of Organizational Behavior Management, 8, 57-72.

Robinson, J. K. (1992). Quantitative analyses of choice in rat and pigeon. The Psychological Record, 42, 437-445.

Ruddle, H. V., Bradshaw, C. M., Szabadi, E., & Foster, T. M. (1982). Performance of humans in concurrent avoidance/positive-reinforcement schedules. Journal of the Experimental Analysis of Behavior, 38, 51-61.

Sidman, M. (1960). Tactics of scientific research: Evaluating experimental data in psychology. New York: Basic Books.

Takahashi, M., & Iwamoto, T. (1986). Human concurrent performances: The effects of experience, instructions, and schedule-correlated stimuli. Journal of the Experimental Analysis of Behavior, 45, 257-267.

Todorov, J. C., Oliveira Castro, J. M., Hanna, E. S., Bittencourt de Sa, M. C. N., & Barreto, M. Q. (1983). Choice, experience, and the generalized matching law. Journal of the Experimental Analysis of Behavior, 40, 99-111.

Trapold, M. A. (1970). Are expectancies based upon different positive reinforcing events discriminably different? Learning and Motivation, 1, 129-140.

Wearden, J. H., & Burgess, I. S. (1982). Matching since Baum (1979). Journal of the Experimental Analysis of Behavior, 38, 339-348.

White, K. G., Pipe, M.-E., & McLean, A. P. (1984). Stimulus and reinforcer relativity in multiple schedules: Local and dimensional effects on sensitivity to reinforcement. Journal of the Experimental Analysis of Behavior, 41, 69-81.

Footnotes

1. The term "negative reinforcement" is typically defined as the increase or maintenance of responding by either the removal or avoidance of some stimulus. By this definition, the term can refer to two procedures: one wherein an ongoing stimulus is terminated contingent on the response (escape), and a second wherein a stimulus that has previously been presented in the absence of responding is avoided contingent on the response (avoidance). The more general term "negative reinforcement" is used throughout this paper to refer to the specific case of avoidance rather than escape.

2. The strategy of holding overall earnings as constant as possible across experiments was adopted for two reasons. First, differential earnings rates might have differentially affected outcomes in some unforeseen way. Second, laboratory lore holds that if earning rates are too low, subject retention becomes difficult, and if they are too high, experimental control becomes challenging.

Table 1
Experimental Conditions - Experiment 1
Subject  Condition  Type (L:R)  Ratio (L:R)  Mean Intervals (s, L:R)  Background Color
TB101    1          P:P         5:1          12:60                    0, 0, 63
TB101    2          P:P         1:2          30:15                    0, 63, 0
TB101    3          N:P         5:1          12:60                    63, 0, 63
TB101    4          P:N         1:2          30:15                    63, 63, 0
TB101    5          P:P         1:5          60:12                    0, 63, 63
TB101    6          P:N         5:1          12:60                    63, 20, 63
TB101    7          N:P         1:2          30:15                    63, 63, 20
TB101    8          P:P         2:1          15:30                    20, 20, 63
TM102    1          P:P         1:5          60:12                    0, 0, 63
TM102    2          P:P         2:1          15:30                    0, 63, 0
TM102    3          N:P         2:1          15:30                    63, 0, 63
TM102    4          N:P         1:5          60:12                    63, 63, 0
TM102    5          P:P         5:1          12:60                    0, 63, 63
TM102    6          P:N         2:1          15:30                    63, 20, 63
TM102    7          P:N         1:5          60:12                    63, 63, 20
TM102    8          P:P         1:2          30:15                    20, 20, 63
SO104    1          P:P         1:5          60:12                    0, 0, 63
SO104    2          P:P         2:1          15:30                    0, 63, 0
SO104    3          N:P         2:1          15:30                    63, 0, 63
SO104    4          N:P         1:5          60:12                    63, 63, 0
SO104    5          P:P         5:1          12:60                    0, 63, 63
SO104    6          P:N         2:1          15:30                    63, 20, 63
SO104    7          P:N         1:5          60:12                    63, 63, 20
SO104    8          P:P         1:2          30:15                    20, 20, 63

Note. Background colors are represented as RGB numbers, each reflecting the saturation of the red, green, and blue hues, respectively; each number could range from a minimum of 0 to a maximum of 63. Thus, for example, "0, 0, 63" has no red or green hue and the maximum blue hue, resulting in pure blue.

Table 2
Experimental Conditions - Experiment 2

Subject  Condition  Type (L:R)  Ratio (L:R)  Mean Intervals (s, L:R)  Background Color
GE105    1          P:P         5:1          12:60                    0, 0, 63
GE105    2          IP:P        5:1          12:60                    0, 63, 0
GE105    3          P:IP        1:2          30:15                    63, 0, 63
GE105    4          P:P         2:1          15:30                    63, 63, 0
GE105    5          IP:P        1:2          30:15                    0, 63, 63
GE105    6          P:P         1:5          60:12                    63, 20, 63
GE105    7          P:IP        5:1          12:60                    63, 63, 20
GE105    8          P:P         1:2          30:15                    20, 20, 63
MJ106    1          P:IP        2:1          15:30                    0, 0, 63
MJ106    2          P:P         1:5          60:12                    0, 63, 0
MJ106    3          IP:P        2:1          15:30                    63, 0, 63
MJ106    4          P:P         5:1          12:60                    63, 63, 0
MJ106    5          P:P         1:2          30:15                    0, 63, 63
MJ106    6          P:IP        1:5          60:12                    63, 20, 63
MJ106    7          IP:P        1:5          60:12                    63, 63, 20
MJ106    8          P:P         2:1          15:30                    20, 20, 63
GR108    1          P:P         1:2          30:15                    0, 0, 63
GR108    2          P:IP        5:1          12:60                    0, 63, 0
GR108    3          IP:P        1:2          30:15                    63, 0, 63
GR108    4          P:P         2:1          15:30                    63, 63, 0
GR108    5          P:P         5:1          12:60                    0, 63, 63
GR108    6          P:IP        1:2          30:15                    63, 20, 63
GR108    7          IP:P        5:1          12:60                    63, 63, 20
GR108    8          P:P         1:5          60:12                    20, 20, 63

Note. Background colors are coded as in Table 1.

Table 3
Experimental Conditions - Experiment 3
Subject  Condition  Type (L:R)  Ratio (L:R)  Mean Intervals (s, L:R)  Background Color
JE109    1          TP:TP       5:1          12:60                    0, 0, 63
JE109    2          TN:TP       5:1          12:60                    0, 63, 0
JE109    3          TP:TN       1:2          30:15                    63, 0, 63
JE109    4          TP:TP       2:1          15:30                    63, 63, 0
JE109    5          TN:TP       1:2          30:15                    0, 63, 63
JE109    6          TP:TP       1:5          60:12                    63, 20, 63
JE109    7          TP:TN       5:1          12:60                    63, 63, 20
JE109    8          TP:TP       1:2          30:15                    20, 20, 63
SL110    1          TP:TN       2:1          15:30                    0, 0, 63
SL110    2          TP:TP       1:5          60:12                    0, 63, 0
SL110    3          TN:TP       2:1          15:30                    63, 0, 63
SL110    4          TP:TP       5:1          12:60                    63, 63, 0
SL110    5          TP:TP       1:2          30:15                    0, 63, 63
SL110    6          TP:TN       1:5          60:12                    63, 20, 63
SL110    7          TN:TP       1:5          60:12                    63, 63, 20
SL110    8          TP:TP       2:1          15:30                    20, 20, 63
NN111    1          TP:TP       1:2          30:15                    0, 0, 63
NN111    2          TP:TN       5:1          12:60                    0, 63, 0
NN111    3          TN:TP       1:2          30:15                    63, 0, 63
NN111    4          TP:TP       2:1          15:30                    63, 63, 0
NN111    5          TP:TP       5:1          12:60                    0, 63, 63
NN111    6          TP:TN       1:2          30:15                    63, 20, 63
NN111    7          TN:TP       5:1          12:60                    63, 63, 20
NN111    8          TP:TP       1:5          60:12                    20, 20, 63

Note. Background colors are coded as in Table 1.

Table 4
Experimental Conditions - Experiment 4

Subject  Condition  Type (L:R)  Ratio (L:R)  Mean Intervals (s, L:R)  Background Color
MJ112    1          TP:TP       5:1          12:60                    0, 0, 63
MJ112    2          TN:TN       1:2          30:15                    0, 63, 0
MJ112    3          TP:TP       1:2          30:15                    63, 0, 63
MJ112    4          TN:TN       5:1          12:60                    63, 63, 0
MJ112    5          TP:TP       2:1          15:30                    0, 63, 63
MJ112    6          TP:TP       1:5          60:12                    63, 20, 63
MJ112    7          TN:TN       2:1          15:30                    63, 63, 20
MJ112    8          TN:TN       1:5          60:12                    20, 20, 63
JL113    1          TP:TP       1:2          30:15                    0, 0, 63
JL113    2          TN:TN       5:1          12:60                    0, 63, 0
JL113    3          TN:TN       2:1          15:30                    63, 0, 63
JL113    4          TP:TP       1:5          60:12                    63, 63, 0
JL113    5          TP:TP       5:1          12:60                    0, 63, 63
JL113    6          TN:TN       1:2          30:15                    63, 20, 63
JL113    7          TP:TP       2:1          15:30                    63, 63, 20
JL113    8          TN:TN       1:5          60:12                    20, 20, 63
SS114    1          TP:TP       1:5          60:12                    0, 0, 63
SS114    2          TP:TP       2:1          15:30                    0, 63, 0
SS114    3          TN:TN       5:1          12:60                    63, 0, 63
SS114    4          TP:TP       1:2          30:15                    63, 63, 0
SS114    5          TN:TN       1:2          30:15                    0, 63, 63
SS114    6          TN:TN       1:5          60:12                    63, 20, 63
SS114    7          TP:TP       5:1          12:60                    63, 63, 20
SS114    8          TN:TN       2:1          15:30                    20, 20, 63

Note. Background colors are coded as in Table 1.

Table 5
Experiment 1 Independent Regressions

Responses
Subject  Condition  Equation           r²    a(N:P) - a(P:P)  c(N:P) - c(P:P)
TB101    P:P        y = 0.806x - .116  .892  .296             .158
TB101    N:P        y = 1.102x + .042  .950
TM102    P:P        y = 0.547x + .043  .998  .722             -.140
TM102    N:P        y = 1.269x - .097  .979
SO104    P:P        y = 0.600x - .058  .870  .632             .588
SO104    N:P        y = 1.232x + .530  .974

Time
TB101    P:P        y = 0.737x - .111  .893  .264             .136
TB101    N:P        y = 1.001x + .025  .960
TM102    P:P        y = 0.478x + .031  .996  .596             -.171
TM102    N:P        y = 1.074x - .140  .987
SO104    P:P        y = 0.464x - .047  .865  .429             .461
SO104    N:P        y = 0.893x + .414  .973

Table 6
Experiment 1 MRC Regressions

Responses
Subject  Model  Equation                                  R²    R² Change  p
TB101    M.E.   y = 0.971x1 + 0.140x2 - 0.107             .912
TB101    Full   y = 0.806x1 + 0.157x2 + 0.295x12 - 0.116  .933  .021       .330
TM102    M.E.   y = 0.969x1 - 0.167x2 + 0.035             .872
TM102    Full   y = 0.547x1 - 0.140x2 + 0.722x12 + 0.043  .982  .110       .007
SO104    M.E.   y = 0.986x1 + 0.638x2 - 0.050             .911
SO104    Full   y = 0.600x1 + 0.587x2 + 0.633x12 - 0.058  .971  .060       .044

Time
TB101    M.E.   y = 0.884x1 + 0.121x2 - 0.103             .920
TB101    Full   y = 0.737x1 + 0.136x2 + 0.264x12 - 0.111  .940  .020       .313
TM102    M.E.   y = 0.826x1 - 0.194x2 + 0.025             .887
TM102    Full   y = 0.478x1 - 0.172x2 + 0.595x12 + 0.031  .989  .102       .003
SO104    M.E.   y = 0.727x1 + 0.495x2 - 0.041             .919
SO104    Full   y = 0.464x1 + 0.460x2 + 0.429x12 - 0.047  .969  .050       .063

Note. M.E. = Main Effects model; in all MRC equations, x1 is the log reinforcement ratio, x2 codes the phase, and x12 = x1 * x2.

Table 7
Experiment 2 Independent Regressions

Responses
Subject  Condition  Equation           r²    a(IP:P) - a(P:P)  c(IP:P) - c(P:P)
GE105    P:P        y = 0.228x + .020  .947  .920              .614
GE105    IP:P       y = 1.148x + .634  .990
MJ106    P:P        y = 0.442x - .044  .973  .931              .357
MJ106    IP:P       y = 1.373x + .313  .963
GR108    P:P        y = 0.312x + .006  .987  .761              .255
GR108    IP:P       y = 1.073x + .261  .915

Time
GE105    P:P        y = 0.156x + .014  .864  .582              .425
GE105    IP:P       y = 0.738x + .439  .996
MJ106    P:P        y = 0.376x - .037  .981  .729              .226
MJ106    IP:P       y = 1.105x + .189  .952
GR108    P:P        y = 0.278x + .003  .986  .572              .181
GR108    IP:P       y = 0.850x + .184  .936

Table 8
Experiment 2 MRC Regressions

Responses
Subject  Model  Equation                                  R²    R² Change  p
GE105    M.E.   y = 0.757x1 + 0.667x2 + 0.023             .848
GE105    Full   y = 0.228x1 + 0.614x2 + 0.920x12 + 0.020  .994  .146       .001
MJ106    M.E.   y = 1.050x1 + 0.345x2 - 0.023             .832
MJ106    Full   y = 0.442x1 + 0.357x2 + 0.931x12 - 0.044  .966  .134       .016
GR108    M.E.   y = 0.766x1 + 0.276x2 - 0.006             .762
GR108    Full   y = 0.312x1 + 0.254x2 + 0.761x12 + 0.006  .925  .163       .042

Time
GE105    M.E.   y = 0.491x1 + 0.458x2 + 0.016             .862
GE105    Full   y = 0.156x1 + 0.425x2 + 0.582x12 + 0.014  .995  .133       .000
MJ106    M.E.   y = 0.852x1 + 0.217x2 - 0.020             .828
MJ106    Full   y = 0.376x1 + 0.226x2 + 0.729x12 - 0.037  .956  .129       .026
GR108    M.E.   y = 0.620x1 + 0.197x2 - 0.006             .793
GR108    Full   y = 0.278x1 + 0.181x2 + 0.571x12 + 0.003  .943  .150       .032

Table 9
Experiment 3 Independent Regressions

Responses
Subject  Condition  Equation           r²    a(TN:TP) - a(TP:TP)  c(TN:TP) - c(TP:TP)
JE109    TP:TP      y = 0.484x - .077  .932  .655                 .151
JE109    TN:TP      y = 1.139x + .074  .989
SL110    TP:TP      y = 1.081x + .015  .996  .433                 -.136
SL110    TN:TP      y = 1.514x - .121  .916
NN111    TP:TP      y = 0.839x + .106  .937  .734                 .005
NN111    TN:TP      y = 1.573x + .111  .960

Time
JE109    TP:TP      y = 0.416x - .064  .914  .552                 .104
JE109    TN:TP      y = 0.968x + .040  .992
SL110    TP:TP      y = 0.903x + .014  .998  .263                 -.098
SL110    TN:TP      y = 1.166x - .084  .918
NN111    TP:TP      y = 0.794x + .099  .939  .631                 -.037
NN111    TN:TP      y = 1.425x + .062  .980

Table 10
Experiment 3 MRC Regressions

Responses
Subject  Model  Equation                                  R²    R² Change  p
JE109    M.E.   y = 0.843x1 + 0.153x2 - 0.069             .859
JE109    Full   y = 0.484x1 + 0.151x2 + 0.655x12 - 0.077  .982  .123       .006
SL110    M.E.   y = 1.299x1 - 0.144x2 + 0.011             .917
SL110    Full   y = 1.081x1 - 0.136x2 + 0.433x12 + 0.015  .942  .025       .260
NN111    M.E.   y = 1.254x1 + 0.005x2 + 0.114             .881
NN111    Full   y = 0.839x1 + 0.006x2 + 0.734x12 + 0.106  .956  .074       .061

Time
JE109    M.E.   y = 0.719x1 + 0.105x2 - 0.057             .860
JE109    Full   y = 0.416x1 + 0.104x2 + 0.552x12 - 0.064  .981  .122       .007
SL110    M.E.   y = 1.036x1 - 0.102x2 + 0.011             .932
SL110    Full   y = 0.903x1 - 0.097x2 + 0.263x12 + 0.014  .947  .015       .351
NN111    M.E.   y = 1.151x1 - 0.038x2 + 0.106             .905
NN111    Full   y = 0.794x1 - 0.037x2 + 0.631x12 + 0.099  .972  .067       .036

Table 11
Experiment 4 Independent Regressions

Responses
Subject  Condition  Equation           r²    a(TN:TN) - a(TP:TP)  c(TN:TN) - c(TP:TP)
MJ112    TP:TP      y = 1.383x - .055  .966  -.067                .149
MJ112    TN:TN      y = 1.316x + .094  .985
JL113    TP:TP      y = 1.093x + .136  .998  .206                 -.272
JL113    TN:TN      y = 1.299x - .136  .984
SS114    TP:TP      y = 0.822x + .069  .988  .376                 -.024
SS114    TN:TN      y = 1.198x + .045  .945

Time
MJ112    TP:TP      y = 1.287x - .047  .969  -.049                .128
MJ112    TN:TN      y = 1.238x + .081  .985
JL113    TP:TP      y = 0.929x + .097  .998  .116                 -.193
JL113    TN:TN      y = 1.045x - .096  .989
SS114    TP:TP      y = 0.665x + .042  .979  .119                 -.041
SS114    TN:TN      y = 0.784x + .001  .939

Table 12
Experiment 4 MRC Regressions

Responses
Subject  Model  Equation                                  R²    R² Change  p
MJ112    M.E.   y = 1.352x1 + 0.147x2 - 0.054             .974
MJ112    Full   y = 1.383x1 + 0.148x2 - 0.067x12 - 0.055  .975  .001       .776
JL113    M.E.   y = 1.192x1 - 0.270x2 + 0.134             .983
JL113    Full   y = 1.093x1 - 0.272x2 + 0.207x12 + 0.136  .990  .007       .164
SS114    M.E.   y = 1.012x1 - 0.021x2 + 0.067             .926
SS114    Full   y = 0.822x1 - 0.023x2 + 0.376x12 + 0.069  .958  .032       .154

Time
MJ112    M.E.   y = 1.263x1 + 0.127x2 - 0.047             .976
MJ112    Full   y = 1.287x1 + 0.128x2 - 0.049x12 - 0.047  .976  .000       .816
JL113    M.E.   y = 0.985x1 - 0.192x2 + 0.096             .990
JL113    Full   y = 0.929x1 - 0.193x2 + 0.115x12 + 0.097  .993  .003       .234
SS114    M.E.   y = 0.725x1 - 0.040x2 + 0.042             .949
SS114    Full   y = 0.665x1 - 0.041x2 + 0.119x12 + 0.042  .955  .006       .492

Table 13
Proportion of Subjects Showing a Steeper Slope for "Experimental" Phases Than for "Baseline" Phases in Each Experiment, by Response-Based Analyses

Experiment  Visual  Descriptive  Normative  MRC
1 (N:P)     3/3     3/3          2/3        2/3
2 (IP:P)    3/3     3/3          3/3        3/3
3 (TN:TP)   3/3     3/3          3/3        1/3
4 (TN:TN)   1/3     2/3          0/3        0/3

Figure Captions

Figure 1. The logarithms of the response (left panels) and time (right panels) allocation ratios plotted against the logarithms of the obtained reinforcement ratios for each subject of Experiment 1. Lines were fit to the data from each phase by the method of least squares. Closed circles and heavy lines represent the data from the concurrent VC schedules of "standard" positive versus "standard" positive reinforcement, and closed triangles and light lines represent the data from the concurrent VC schedules of "standard" negative versus "standard" positive reinforcement. Note the different scales on the axes of SO104's plot for responses.

Figure 2. The logarithms of the response (left panels) and time (right panels) allocation ratios plotted against the logarithms of the obtained reinforcement ratios for each subject of Experiment 2. Lines were fit to the data from each phase by the method of least squares. Closed circles and heavy lines represent the data from the concurrent VC schedules of "standard" positive versus "standard" positive reinforcement, and closed diamonds and light lines represent the data from the concurrent VC schedules of "inverse" positive versus "standard" positive reinforcement.

Figure 3. The logarithms of the response (left panels) and time (right panels) allocation ratios plotted against the logarithms of the obtained reinforcement ratios for each subject of Experiment 3. Lines were fit to the data from each phase by the method of least squares. Open circles and heavy lines represent the data from the concurrent VC schedules of "total feedback" positive versus "total feedback" positive reinforcement, and closed squares and light lines represent the data from the concurrent VC schedules of "total feedback" negative versus "total feedback" positive reinforcement.

Figure 4. The logarithms of the response (left panels) and time (right panels) allocation ratios plotted against the logarithms of the obtained reinforcement ratios for each subject of Experiment 4. Lines were fit to the data from each phase by the method of least squares.
Open circles and heavy lines represent the data from the concurrent VC schedules of "total feedback" positive versus "total feedback" positive reinforcement, and open squares and light lines represent the data from the concurrent VC schedules of "total feedback" negative versus "total feedback" negative reinforcement.

Figure 5. Slope estimates (top panel) and slope-estimate differences (bottom panel) of each matching function for each subject of each experiment. In the top panel, dark bars represent the slope estimates for the "baseline" phase matching functions and light bars represent the slope estimates for the "experimental" phase matching functions. In the bottom panel, bars represent the difference between the "experimental" matching-function slope estimate and the "baseline" matching-function slope estimate. The horizontal line in the bottom panel demarcates the normative criterion (a = .388) used as the primary evaluative tool for the present investigation.

[Figures 1-4: multipanel scatterplots, one row of panels per subject (responses at left, time at right), plotting log allocation ratio against log reinforcement ratio with least-squares lines for each phase; axes span -2 to 2 log units (-3 to 3 for SO104's response-based panel). Panel contents and plotting symbols are as described in the captions above.]

[Figure 5: bar graphs of slope estimates (top panel) and slope-estimate differences (bottom panel) for each subject of each experiment; the horizontal line in the bottom panel marks the normative criterion a(EXP) - a(BL) = .388.]

Appendix A
Intake, Instructions, and Training

The first visit to the laboratory began with presentation of the informed-consent agreement. Volunteers were asked to read it, but not to sign it right away, so that they could see the task before committing to participate. After they had read the informed consent, and had any questions answered, they were shown the workroom and the particular workstation at which they would be working. Following this brief spatial orientation, they were presented with the following set of general task instructions:

Instructions to the Subject

1) You cannot take a break during a working session. The total time you spend in the lab will be divided into short working sessions. In between these sessions, you will have the option to take a break, go to the restroom, or ask questions. If you do not need a break and would like to continue working, then you should just remain in the "run room."
You should realize that money cannot be earned while you are on a break; therefore it is in your best interests to take few, if any, breaks.

2) Your task will be to click on the moving boxes using the mouse pointer and the left button on the mouse. After you click the button on the first screen you see, your "work" screen will appear. On this "work" screen, you will notice boxes moving around on two different sides of the screen. Your job is to use the mouse to click on these moving boxes. Once you start working on one side, the other side of the screen will turn black. This means that your mouse is turned on for the white side and turned off for the black side. Either side can be black or white, depending on where you are working at the moment. Clicking on the white side will help you earn money, or avoid losing money. Clicking on the black side will not help you, because the mouse is turned off for that side. While a side is black, you cannot earn money that may be available there. Similarly, while a side is black, you cannot avoid money losses that happen there. To change a side from black to white, you will need to click several times on the small "Change" box in the middle of your screen. When you have clicked this box enough times, the black side will turn white, and the mouse will be turned on there. The mouse will be off for the other side. Thus, the mouse is always on for only one side at a time. You can change sides whenever you want by clicking on the "Change" box. There are no rules for how you should respond, except that you want to maximize the number of points you get as quickly as you can.

3) The experimenters will keep a running tally of your total money earned. At the end of each session, a message will be displayed informing you of how much money you have accrued for that session. The experimenter will record this amount and add it to your previous cumulative total to get a running total. It is based on this cumulative amount that you will be paid.

4) Please do not hesitate to ask the experimenter any questions you may have.

They were then given a 1-min task demonstration with concurrent VC5 schedules of positive reinforcement (reinforcer magnitude = 1¢) programmed for both work areas. Language used to address any questions they had did not deviate significantly from the written instructions. If the volunteers were interested in participating in the experiment following the demonstration, they were asked to sign the informed consent.

Two 1-min training sessions followed agreement to participate. The training sessions provided free-operant, forced-choice arrangements in which the "Change" box was inoperative and only one side had a schedule programmed. Opposite sides were used for the two training sessions, and the first side used was varied across subjects. All sessions were programmed with VC5 schedules and reinforcer magnitudes of 1¢, and subjects were informed that the "money" they earned during the training sessions would not count toward their final earnings total. Different instructions had to be given for each training session of each experiment (except that Experiments 3 and 4 required the same instructions) to accord with the "gain vs. loss" and "feedback asymmetry" variations of the different conditions. They were as follows:

For Experiment 1, training session 1: Sometimes clicking your mouse on a box will earn you money.
If you should earn money, a flashing message will show this on your screen. The upcoming session provides an example with only one side of the screen active.

For Experiment 1, training session 2: Sometimes clicking your mouse on a box will keep you from losing money, but there will be no message to show you that you have avoided losing money. Sometimes a message will show you that you have lost money. The upcoming session provides an example with only one side of the screen active.

For Experiment 2, training session 1: Sometimes clicking your mouse on a box will earn you money. If you should earn money, a flashing message will show this on your screen. The upcoming session provides an example with only one side of the screen active.

For Experiment 2, training session 2: Sometimes clicking your mouse on a box will earn you money, but there will be no message to show you that you have earned money. Sometimes a message will show you that you have missed a chance to earn money. The upcoming session provides an example with only one side of the screen active.

Because Experiments 3 and 4 used the same contingency structures (there was simply no differential outcome in Experiment 4), the same instructions were given for both.

For Experiments 3 and 4, training session 1: Sometimes clicking your mouse on a box will earn you money. If you should earn money, a flashing message will show this on your screen. Sometimes a message will show you that you have missed a chance to earn money. The upcoming session provides an example with only one side of the screen active.

For Experiments 3 and 4, training session 2: Sometimes clicking your mouse on a box will keep you from losing money. If you should avoid losing money, a flashing message will show this on your screen. Sometimes a message will show you that you have lost money. The upcoming session provides an example with only one side of the screen active.

Each training session was designed to give the subject experience with each of the four types of consequences that would be presented in the experiment (two types of feedback for each of the two types of consequences), and the experimenter prompted the subject in whatever way was necessary to ensure this. For example, if the subject was training on a VC5 schedule of positive reinforcement and was clicking the mouse rapidly enough to earn every positive reinforcer, the experimenter prompted the subject not to respond for a few moments, just to experience the feedback (or lack thereof) for not responding. After the second training session was completed, subjects were given the following final set of instructions prior to beginning the experiment proper:

From now on, both sides of your screen will be active. Note that clicking the two boxes may affect your earnings differently. You can respond as much or as little as you like, wherever you like, and use whatever strategy you like overall. It is up to you to figure out how to work each side to your best advantage.

After the subject read the final set of instructions, all instructions were removed, the background noise began (the subject's own music in most cases), the door to the workroom was shut, and the experiment began.

Appendix B
Raw Data Averaged from the Four-Session Stable Data
Exp.  Subject  Condition (L:R)  Responses (L/R)  Time in schedule, s (L/R)  Reinforcers obtained (L/R)  Reinforcers missed (L/R)  Sessions to stability
1  TB101  P5:P1   316.50/136.50   328.76/150.45   31.25/6.25    7.25/1.50    6
1  TB101  P1:P2   187.25/267.25   200.14/278.36   9.75/23.00    5.75/10.50   7
1  TB101  N5:P1   379.50/31.75    428.96/49.39    37.25/6.00    2.75/1.75    10
1  TB101  P1:N2   166.25/199.25   214.46/263.84   13.25/25.50   2.50/6.25    15
1  TB101  P1:P5   42.50/400.00    54.17/424.68    5.25/36.00    2.25/3.00    10
1  TB101  P5:N1   415.25/40.50    429.67/48.39    37.25/4.00    2.25/3.50    7
1  TB101  N1:P2   155.50/278.25   176.45/301.81   11.50/26.25   4.00/6.00    15
1  TB101  P2:P1   211.50/158.25   264.98/213.67   26.00/13.25   6.75/2.75    9
1  TM102  P1:P5   387.75/774.00   170.05/309.29   7.00/31.25    1.00/7.50    5
1  TM102  P2:P1   742.50/483.50   282.99/196.04   26.75/13.75   5.00/2.25    9
1  TM102  N2:P1   298.75/286.50   239.79/239.51   26.00/14.75   6.75/1.25    6
1  TM102  N1:P5   19.75/391.75    29.42/449.67    4.25/38.50    3.25/.50     9
1  TM102  P5:P1   309.25/103.25   345.04/133.91   33.75/5.75    6.00/2.00    11
1  TM102  P2:N1   362.50/79.50    381.44/97.38    30.75/10.00   2.00/5.75    8
1  TM102  P1:N5   97.50/874.75    78.21/401.00    7.00/37.50    1.00/2.00    11
1  TM102  P1:P2   416.50/585.50   199.36/279.85   10.75/23.00   5.25/9.75    6
1  SO104  P1:P5   82.00/382.50    111.47/367.56   5.50/34.25    2.25/4.00    17
1  SO104  P2:P1   439.75/254.00   293.21/185.56   28.00/13.25   6.00/3.00    15
1  SO104  N2:P1   607.75/27.00    442.55/35.23    30.75/6.50    2.00/9.00    5
1  SO104  N1:P5   270.50/298.50   233.25/245.71   7.50/30.75    .50/8.50     22
1  SO104  P5:P1   417.25/242.25   285.92/192.96   32.75/6.50    6.00/1.50    6
1  SO104  P2:N1   257.75/231.75   241.66/237.46   26.25/13.50   5.50/2.75    14
1  SO104  P1:N5   9.25/997.50     16.78/462.01    2.50/38.00    4.50/1.50    16
1  SO104  P1:P2   315.25/382.00   218.61/260.34   13.75/28.25   2.75/4.50    5
2  GE105  P5:P1   292.00/194.50   274.83/203.27   28.25/7.50    9.75/.50     12
2  GE105  IP5:P1  831.75/31.75    430.30/47.93    36.25/6.75    2.50/.75     13
2  GE105  P1:IP2  33.75/716.50    57.63/421.11    9.00/30.25    6.50/2.50    6
2  GE105  P2:P1   262.75/240.50   241.49/236.73   26.50/14.75   6.00/1.00    5
2  GE105  IP1:P2  422.75/136.25   334.38/144.07   15.50/18.50   .50/13.50    16
2  GE105  P1:P5   165.00/215.25   216.69/261.11   7.50/29.75    .25/8.50     6
2  GE105  P5:IP1  243.00/218.75   240.79/237.46   30.50/7.50    8.25/.50     16
2  GE105  P1:P2   220.25/232.50   236.19/239.11   14.00/25.00   1.75/7.75    5
2  MJ106  P2:IP1  913.50/190.25   386.43/92.38    30.25/9.25    2.25/6.75    18
2  MJ106  P1:P5   351.75/710.00   167.98/310.68   6.50/32.00    1.50/6.50    11
2  MJ106  IP2:P1  1026.25/41.50   444.90/33.74    32.50/6.25    .75/9.00     9
2  MJ106  P5:P1   650.50/361.00   299.97/179.23   30.25/7.00    9.25/1.00    13
2  MJ106  P1:P2   324.50/582.50   185.43/293.74   12.75/28.00   2.75/4.75    18
2  MJ106  P1:IP5  47.75/937.00    51.31/427.12    6.50/36.00    1.25/2.00    7
2  MJ106  IP1:P5  219.75/848.00   109.38/368.11   5.25/34.75    2.50/4.25    7
2  MJ106  P2:P1   493.00/424.25   250.92/227.86   24.00/13.25   8.50/2.75    7
2  GR108  P1:P2   583.00/620.25   231.90/247.41   14.50/23.00   1.50/9.25    13
2  GR108  P5:IP1  934.25/386.75   322.63/156.67   34.50/6.50    4.75/1.00    8
2  GR108  IP1:P2  347.00/811.00   159.15/320.29   13.25/28.50   2.25/5.00    6
2  GR108  P2:P1   712.00/560.50   266.51/212.69   26.75/12.75   6.00/3.25    6
2  GR108  P5:P1   789.00/501.75   285.92/193.58   31.50/7.25    8.00/.75     5
2  GR108  P1:IP2  142.50/1063.75  95.08/384.26    12.75/29.00   2.75/3.00    11
2  GR108  IP5:P1  1279.00/114.25  420.11/59.37    38.00/5.75    1.00/2.25    7
2  GR108  P1:P5   482.50/789.00   186.83/292.29   6.75/30.00    1.25/8.75    6
3  JE109  TP5:TP1 272.75/168.00   291.38/187.95   32.25/7.00    6.75/.75     18
3  JE109  TN5:TP1 376.50/49.00    404.63/73.82    35.50/6.25    3.00/1.50    10
3  JE109  TP1:TN2 76.25/312.75    118.36/360.51   12.50/31.00   3.25/2.75    8
3  JE109  TP2:TP1 217.75/193.25   247.97/231.23   25.00/14.00   7.25/1.75    15
3  JE109  TN1:TP2 124.25/293.50   151.89/327.21   13.75/28.75   2.00/4.50    7
3  JE109  TP1:TP5 104.25/321.25   130.96/348.31   7.00/34.25    1.00/5.50    8
3  JE109  TP5:TN1 349.25/73.50    382.84/96.25    34.50/7.00    5.25/1.00    6
3  JE109  TP1:TP2 177.25/235.25   215.59/263.52   13.25/27.25   2.75/5.00    11
3  SL110  TP2:TN1 490.25/281.50   290.31/188.75   29.25/12.25   3.75/3.50    8
3  SL110  TP1:TP5 113.50/813.50   78.90/400.24    6.00/35.25    1.75/3.00    10
3  SL110  TN2:TP1 432.25/300.75   277.34/201.51   25.25/14.25   6.75/1.75    9
3  SL110  TP5:TP1 866.50/126.50   398.16/80.93    36.50/6.50    1.25/1.25    7
3  SL110  TP1:TP2 358.25/683.50   172.68/306.52   12.25/26.50   3.50/5.75    20
3  SL110  TP1:TN5 97.50/878.25    73.56/405.65    6.50/37.00    1.25/1.50    14
3  SL110  TN1:TP5 22.00/1082.00   23.30/456.12    5.50/38.25    2.25/1.00    6
3  SL110  TP2:TP1 711.25/254.75   338.45/140.80   30.00/11.25   3.00/4.50    21
3  NN111  TP1:TP2 483.00/697.75   205.55/273.91   11.00/26.25   4.25/6.50    9
3  NN111  TP5:TN1 878.00/46.25    452.85/26.64    38.50/4.75    .75/2.75     19
3  NN111  TN1:TP2 221.50/1074.50  97.81/381.50    11.00/28.75   4.50/3.75    13
3  NN111  TP2:TP1 710.25/520.25   275.94/203.42   27.25/13.50   5.00/2.50    14
3  NN111  TP5:TP1 1185.50/138.75  422.44/57.05    36.00/5.50    2.50/2.50    13
3  NN111  TP1:TN2 55.00/1235.00   36.14/443.36    9.00/32.75    6.50/.50     9
3  NN111  TN5:TP1 1229.50/74.00   446.17/33.29    38.75/5.25    .00/2.50     9
3  NN111  TP1:TP5 227.25/818.50   105.43/374.03   5.00/33.75    2.25/5.25    13
4  MJ112  TP5:TP1 427.75/47.00    425.57/53.72    35.50/4.75    3.00/3.00    10
4  MJ112  TN1:TN2 126.00/310.00   144.52/334.80   9.00/26.50    6.25/6.50    5
4  MJ112  TP1:TP2 120.50/330.50   132.73/346.68   10.25/27.25   5.50/6.50    14
4  MJ112  TN5:TN1 439.50/24.25    450.03/29.09    37.75/4.25    1.00/3.00    15
4  MJ112  TP2:TP1 388.25/57.50    410.52/68.72    29.50/9.00    2.75/6.75    10
4  MJ112  TP1:TP5 12.75/447.75    17.37/462.01    4.25/37.75    3.00/.75     8
4  MJ112  TN2:TN1 370.50/66.25    395.94/83.61    30.25/11.25   1.75/4.50    15
4  MJ112  TN1:TN5 37.25/422.50    43.62/435.82    6.25/36.75    1.75/1.75    5
4  JL113  TP1:TP2 453.50/711.75   189.78/289.58   12.75/26.75   3.25/5.00    9
4  JL113  TN5:TN1 666.25/135.50   380.91/98.44    36.75/7.00    2.25/.75     7
4  JL113  TN2:TN1 443.50/159.00   332.54/146.89   30.25/14.00   2.75/1.75    13
4  JL113  TP1:TP5 79.50/388.00    95.70/383.76    6.75/36.50    1.00/2.50    9
4  JL113  TP5:TP1 542.50/67.00    408.28/71.06    37.00/6.75    2.00/1.25    7
4  JL113  TN1:TN2 189.50/682.00   124.18/355.29   13.75/29.00   2.25/3.50    23
4  JL113  TP2:TP1 824.50/201.00   363.54/116.00   31.50/12.50   1.75/3.25    20
4  JL113  TN1:TN5 76.25/969.75    58.97/420.32    7.25/37.50    .50/1.75     5
4  SS114  TP1:TP5 160.50/427.75   143.49/335.68   7.50/35.00    .50/3.50     16
4  SS114  TP2:TP1 508.00/232.50   287.80/191.29   30.75/14.75   3.50/1.25    9
4  SS114  TN5:TN1 981.50/95.75    383.22/95.92    36.50/7.00    2.50/1.00    6
4  SS114  TP1:TP2 242.25/455.00   188.18/290.88   14.00/29.50   1.75/3.75    17
4  SS114  TN1:TN2 203.50/725.00   132.74/346.79   14.25/29.25   1.50/3.00    19
4  SS114  TN1:TN5 177.25/711.25   132.48/347.03   7.25/35.75    .50/3.25     5
4  SS114  TP5:TP1 535.25/115.50   378.48/100.88   36.00/7.00    2.75/.75     13
4  SS114  TN2:TN1 685.25/306.25   308.07/171.26   29.00/14.25   2.75/1.75    9

CHAPTER 3

EXTENDED GENERAL DISCUSSION

Placing the present results into proper context requires a brief review of the prevailing models of free-operant choice. The generalized matching relation (GMR) is the most widely accepted account, but it has two major limitations. First, the GMR makes limited predictions, constrained specifically to the relationship between behavior (and time) allocation and relative reinforcement frequency; it is silent on the differential effects of all other potential choice-controlling variables. Although the estimates of the fitted parameters of the GMR offer measures of those effects, the GMR makes no predictions about the means by which the variables interact to determine the parameter values. Second, the GMR theoretically applies only to free-operant choice.
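For reference, the GMR in the logarithmic form fitted throughout the present investigation (Baum, 1974) can be written as

log(B1/B2) = a log(R1/R2) + log c,

where B1 and B2 are the responses (or time) allocated to the two alternatives, R1 and R2 are the reinforcers obtained from them, a is the sensitivity (slope) parameter, and log c is the bias (intercept) parameter.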
Davison and Nevin's (1999) quantitative model of contingency discriminability theory (CDT), the discriminative law of effect (DLOE), improves upon the GMR in both respects. First, the most notable conceptual improvement of the DLOE over the GMR is that it distinguishes between antecedent stimulus effects and consequence effects on behavior. In effect, it partitions the a parameter of the GMR, which is really a catch-all measure of anything that influences sensitivity to reinforcement, into d_sb and d_br, and proposes mechanisms by which they differentially influence responding (stimulus generalization and response induction, respectively). It is in this sense that the DLOE makes more specific predictions than the GMR. Second, the DLOE is more general than the GMR. It is able to account for many data from the extensive conditional-discrimination literature because it explicitly invokes conditional stimulus-control effects.

That said, it should be noted that a variant of the GMR, the concatenated GMR (Baum & Rachlin, 1969; Rachlin, Logue, Gibbon, & Frankel, 1986; see Davison & McCarthy, 1988, for a review), explicitly invokes several important response-consequence choice-controlling variables (e.g., reinforcer magnitude, delay, or quality). Unfortunately, this model has not received much empirical attention, leaving it largely a conceptual exercise to date. Moreover, the model has been criticized as tautological (for an argument that exceeds the scope of the present discussion, see Rachlin, 1971).

The DLOE does not explicitly address all of the choice-controlling variables mentioned in the concatenated GMR. It states simply that the parameter representing response-consequence contingency discriminability, d_br, will influence the degree of response induction. Consequently, Davison and Nevin (1999) acknowledge that "a substantial research effort will be required to identify independent functions for the discriminability and value of the consequences of choice" (p. 472). The implication, therefore, is that for now the model is restricted to making qualitative predictions regarding the effects of other response-consequence choice-controlling variables, but that it is at least in principle possible to specify quantitatively how they interact to contribute to d_br. The present investigation provides a data set that might be an early step toward that aim. It confirms the qualitative prediction made by Davison and Nevin (1999, pp. 444-445) that concurrently available different types of reinforcers should increase behavioral sensitivity, as suggested by research on the differential-outcomes effect (Trapold, 1970). These data, in combination with future data, should provide a basis for future quantitative analyses of how reinforcement type interacts with other response-consequence variables to influence d_br. The present investigation also provides a procedure and general research strategy that may prove useful in further addressing the concatenation of the variables assumed to influence d_br.
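As a concrete illustration of the slope prediction at issue, the sketch below uses a reallocation rule of the kind found in contingency-discriminability models such as Davison and Jenkins (1985). It is an illustrative assumption, not Davison and Nevin's (1999) Equation 18 itself, and all values are hypothetical; its only point is that fitted GMR slopes steepen as the discriminability parameter grows.

import numpy as np

def predicted_log_ratio(log_r, d_br):
    # B1/B2 = (d*R1 + R2) / (R1 + d*R2): one common way of writing
    # contingency-discriminability reallocation (equivalent to crediting
    # a proportion p = d/(1 + d) of reinforcers to the response that
    # actually produced them). d = 1 yields indifference; d approaching
    # infinity yields strict matching.
    r = 10.0 ** log_r  # R1/R2
    return np.log10((d_br * r + 1.0) / (r + d_br))

log_r = np.linspace(-0.7, 0.7, 8)  # a range like that of the present study
for d in (2.0, 5.0, 20.0):
    y = predicted_log_ratio(log_r, d)
    a = np.polyfit(log_r, y, 1)[0]  # fitted GMR sensitivity (slope)
    print(f"d_br = {d:g}: fitted slope a = {a:.2f}")

Higher d_br yields a steeper fitted slope, which is the qualitative pattern CDT predicts when concurrently available reinforcers differ in type.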
Studies of Reinforcer Magnitude as a Framework for Studies of Reinforcer Type

While there have been studies in the context of the matching relation involving concurrently available different types of reinforcement (e.g., Farley & Fantino, 1978; Hollard & Davison, 1971; Logue & deVilliers, 1981; Miller, 1976; Ruddle, Bradshaw, Szabadi, & Foster, 1982), the primary purpose of those studies was to demonstrate the ubiquity of the matching relation rather than to ask questions about the relationships among choice-controlling variables (cf. Bron, Sumpter, Foster, & Temple, 2003). Furthermore, from the perspective of the present investigation, those studies suffered from one of two limitations. First, some employed nonhuman animals as subjects, requiring the use of reinforcers that cannot easily be compared on a per-unit basis (e.g., food and electric current; Hollard & Davison, 1971; Logue & deVilliers, 1981). Second, some used easily scalable conditioned reinforcers of different types but did not employ a control phase of concurrently available same-type reinforcers by which to judge relative effects (e.g., Ruddle et al., 1982). The present investigation and closely related studies (Critchfield & Lane, 2005; Magoon & Critchfield, 2005) are the first to use easily scalable reinforcers to make within-subject comparisons of the effects of concurrently available different types of reinforcers against those of concurrently available same-type reinforcers in free-operant choice. Several recent studies, however, have examined the effects of magnitude of reinforcement using similar procedures. A brief review of those studies might inform the means by which DLOE model development, in terms of d_br, might proceed to include reinforcement type.

Landon, Davison, and Elliffe (2003) arranged a concurrent-schedules switching-key procedure in which pigeons' key pecks were reinforced. Relative frequency of reinforcement and overall absolute magnitude of reinforcement were held equal and constant across conditions, but relative magnitude was varied. Landon et al. found that the pigeons' pecking undermatched relative reinforcer-magnitude ratios, and did so to a greater degree than the same birds undermatched relative reinforcer frequency in a different study involving the same subjects (see Landon et al., 2003, Table 2, for the relevant comparison). Two of their conclusions were that behavior is less sensitive to reinforcer magnitude than to reinforcer frequency, and that the effects of these two variables are independent.

McLean and Blampied (2001) arranged a two-key concurrent-schedules procedure in which pigeons' key pecks were reinforced. Relative frequency of reinforcement was varied across identical ranges in two phases: one in which the concurrently available reinforcers were of identical magnitude, and one in which they were of different magnitudes. In the phase with different magnitudes of reinforcement there was a pronounced bias for the larger-magnitude reinforcers, whereas there was no systematic bias in the other phase. There were no systematic slope differences between the two phases.
While McLean and Blampied's results corresponded well with Landon et al.'s (2003) conclusion that reinforcement frequency and magnitude affect behavior independently, their results did not agree with Landon et al.'s conclusion that behavior is less sensitive to reinforcer magnitude than to reinforcer frequency.

Critchfield and Merrill (2005) extended McLean and Blampied's work to human subjects using procedures modeled closely after those of the present investigation. One of their phases arranged concurrently available schedules of positive reinforcement with equal-magnitude reinforcers, and the other arranged concurrently available schedules of positive reinforcement with different magnitudes of reinforcement. Their results mirrored those of McLean and Blampied (2001): the phase with different magnitudes of reinforcement showed a clear bias toward the larger-magnitude reinforcers, the other phase showed no bias, and there were no slope differences between phases.

The three studies just discussed illustrate how within-subject comparisons of matching functions can be used to analyze the separate effects of frequency and magnitude of reinforcement on behavior. The results of all three converge on the conclusion that magnitude of reinforcement exerts an effect independent of that of reinforcement frequency. Landon et al. (2003) came to this conclusion because nonzero slopes resulted when only relative magnitude of reinforcement was varied. McLean and Blampied (2001) and Critchfield and Merrill (2005) came to the same conclusion because there were intercept but not slope differences between phases when the range of relative reinforcement frequency was held constant across phases but magnitude was the same within one phase and different within the other. In all three cases, the parameters of matching functions were the metrics by which effects were evaluated.

The present investigation adopted this same general strategy to assess independent and interactive effects between frequency and type of reinforcement. The difference between the outcomes of Experiments 1-3 and those of Experiment 4 suggests an independence of reinforcer-frequency effects and reinforcer-type effects similar to the independence found between reinforcer-frequency effects and reinforcer-magnitude effects. Experiment 1 (like Critchfield & Lane, 2005, and Magoon & Critchfield, 2005) demonstrated this independence generally, while Experiments 2 and 3 attempted to determine which specific aspects of the positive and negative reinforcement contingencies (i.e., gain vs. loss and/or feedback asymmetry) were responsible for reinforcer-type effects. Hybrid contingencies were arranged such that reinforcers differed only in terms of money gain versus money loss (Experiment 3) or only in terms of feedback asymmetry (Experiment 2). Both factors proved sufficient, and neither proved necessary, to create a slope effect.

It would be premature to assume that the effects resulting from the positive and negative reinforcement manipulation apply to all instances of concurrently available different types of reinforcement, given that analogous manipulations with other pairs of nonidentical reinforcer types are virtually nonexistent in the literature (cf. Bron et al., 2003).
That feedback asymmetry and money gain versus money loss each created the slope effect of interest lends optimism to the notion that the present findings reflect a general phenomenon, but only new studies using pairs of reinforcer types other than positive versus negative can address this more directly. As an example, consider a study with human subjects that arranged, in one phase, two-ply concurrent schedules of positive reinforcement with points exchangeable for money in both components (as in the "baseline" phases of Experiments 1 and 2 of the present investigation) and, in the other phase, concurrent schedules of positive reinforcement with points
The "experimental" phase could consist of identical schedules except that reinforcer type (e.g., positive vs. negative) would differ between the two concurrent operants. The findings of Landon et al. (2003) predict a nonzero slope for the "baseline" function, based strictly on the effects of the varied relative reinforcer magnitudes. Any increase in slope in the "experimental" phase would indicate an additive effect of reinforcer type. Extending this general research strategy to other response-consequence choice-controlling variables and their various combinations could provide the means by which d_br could be unpacked and the DLOE expanded into a concatenated DLOE.

Comments on the DLOE

On reinforcer type versus magnitude. The DLOE predicts that any difference between the response-consequence contingencies of concurrent operants should heighten discriminability and increase behavioral sensitivity. The findings of Critchfield and Merrill (2005) and McLean and Blampied (2001) clearly indicate that this is not universally true. Their concurrent different-reinforcer phases arranged different magnitudes of reinforcement: "+6" versus "+2" in the former case and 6-s versus 2-s access to food in the latter. Although both studies showed a systematic bias toward the larger reinforcer, neither showed increased sensitivity. This state of affairs seems irreconcilable with the DLOE specifically and with CDT generally. As Critchfield and Merrill (2005) noted, "it seems illogical – particularly in the context of contingency discrimination theory – that a large reinforcer can be preferred over a small one without being simultaneously discriminated from it" (p. 17). Thus, in some cases concurrently available different reinforcers increase sensitivity and in other cases they do not. It is worth investigating which dimensions of reinforcer "type" are capable of producing the effects found in the current investigation.

Reinforcer type differences and bias. Another issue raised by the present results concerns the systematic bias found for all subjects in Experiment 2. Figure 1 illustrates the intercept values (top panel) and between-phase intercept differences (bottom panel) for each subject in the four experiments. In the top panel of Figure 1, dark bars represent the intercepts of "baseline" phases and light bars represent the intercepts of "experimental" phases. In the bottom panel of Figure 1, bars represent intercept differences between regression functions ("experimental" phase intercept minus "baseline" phase intercept). In light of Magoon and Critchfield's (2005) results regarding bias, the nonsystematic results of Experiments 1, 3, and 4 are unsurprising, but the results from Experiment 2 clearly indicate a systematic bias toward "inverse" positive reinforcement. On first consideration, because the effects in Experiment 2 resemble those from experiments in which magnitude and frequency were manipulated (e.g., Critchfield & Merrill, 2005; McLean & Blampied, 2001), one might speculate that the different-magnitude reinforcers used in Experiment 2 (i.e., 2¢ for all conditions) somehow interacted with frequency much as concurrently available different-magnitude reinforcers do. This seems unlikely, though, given that Experiment 4 also used different-magnitude reinforcers (i.e., 7.5¢). Beyond this, there is no apparent explanation for these effects, but the current procedures seem well suited to examine this outcome further.
As an example of how they might, consider an arrangement in which "standard" negative reinforcement was programmed for both response options in the "baseline" phase, and "inverse" negative reinforcement was programmed for one option and "standard" negative reinforcement for the other in the "experimental" phase (admittedly, subject retention would be an issue to overcome). If the same bias were found toward "inverse" negative reinforcement as was found toward "inverse" positive reinforcement in Experiment 2 of the current investigation, that might indicate that feedback asymmetry affects bias, but only when there are no gain-versus-loss differences. This is another example of the flexibility of the current procedures for addressing a variety of research questions.

On the provisional status of the DLOE. Although Davison and Nevin's (1999) quantitative model of the DLOE appears to be an improvement over previous models of choice in terms of its increased precision and scope, even its authors acknowledge that "the probability of [their] having lit upon the correct model [is] probably something rather less than 1 in 10^6" (Davison & Nevin, 1999, p. 475). A viable quantitative statement of CDT is probably a long way off. Given the number of variables invoked and the range of outcomes to which the model is designed to apply, even some of the qualitative statements of the DLOE will undoubtedly need to be refined. Bearing this in mind, several conceptual critiques of the DLOE warrant mention.

The first is specific to the concerns of the present investigation and other work with concurrent schedules. Davison and Nevin (1999) argued that "conventional two-key concurrent schedules are arranged in the presence of a single stimulus condition, so S_1 and S_2 are equivalent" (p. 469), thereby allowing the parameter representing stimulus-response contingency discriminability (d_sb) to be dropped. In quantitative terms, this yields Equation 18 in Chapter 1 (also Equation 18 in Davison and Nevin's report). This simplifying assumption, while conceptually and practically convenient, appears to be at odds with the claim that "response differentiation will be reflected in both parameters: For example, . . . increasing the separation between correct responses would increase both d_sb and d_br" (Davison & Nevin, 1999, p. 445). If it is true that response differentiation increases d_sb, then it seems difficult to justify Equation 18 for any concurrently available two-response procedure. In fact, the only circumstance in which it might be plausible to set d_sb = 1 would be an unsignaled Findley switching-key procedure (Findley, 1958). However, this apparent contradiction does not change the fundamental premise of CDT proper: variables that increase response-consequence contingency discriminability should heighten behavioral sensitivity. A test of this qualitative prediction would seem to require only that every effort be made to minimize d_sb effects and that any antecedent stimuli be held constant across conditions. That was the approach taken in the present investigation.

A related issue is that the model does not seem able to account for the discriminative effects of reinforcement frequency. Most fundamentally, the model (refer to Equation 18 for ease of understanding) allows relative frequency of reinforcement (R_1:R_2) to vary even when d_br equals one.
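For ease of reference, the following is a sketch of that reduced model, written here in the familiar reinforcer-misallocation form in which a proportion d_br/(d_br + 1) of each alternative's reinforcers is credited to the response that produced them (the exact expression appears as Equation 18 in Chapter 1):

\[
\frac{B_1}{B_2} = c \cdot \frac{d_{br} R_1 + R_2}{R_1 + d_{br} R_2}
\]

When d_br = 1, the predicted response ratio reduces to the bias term c regardless of R_1:R_2; as d_br grows large, the ratio approaches strict matching, B_1/B_2 = R_1/R_2.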
This prediction makes intuitive sense: if an organism is unable to discriminate which response results in reinforcement, then, all other things being equal, responding should be equally distributed despite changing relative reinforcement frequencies. However, in light of what is known about the discriminative functions of reinforcement (e.g., Lattal, 1975), it seems that any reinforcement discrepancy across alternatives should result in d_br being at least marginally greater than one. Furthermore, on an intuitive reading of the model, the discriminative functions of reinforcement should contribute to d_sb as well as to d_br. Because reinforcers can serve a discriminative function, any time d_br exceeds one because of reinforcement (not response) factors, the potential exists for d_sb also to exceed one. The model appears to imply that the values of d_sb and d_br are determined independently, though once determined, they are allowed to interact.

Concatenating the DLOE. Two other issues must be addressed if there is to be future research into the interaction of response-consequence choice-controlling variables. The concatenated GMR grants equivalent weight and function to all dimensions of reinforcement (e.g., magnitude, delay, type, and frequency), but the DLOE seems to imply that all dimensions but frequency interact with each other to determine d_br and only then interact with frequency. This conceptualization does not give equal weight and function to all dimensions of reinforcement; in effect, it makes every other reinforcement dimension subservient to frequency. Of course, it is an empirical question whether the dimensions of reinforcement enter into the relation as specified by the concatenated GMR or by some other means. In fact, in light of Critchfield and Merrill's (2005) results, it may be that magnitude interacts directly with bias (log c) but not with d_br, though again, as Critchfield and Merrill pointed out, it seems logically difficult to accept that magnitude could influence preference (i.e., bias) without the difference being discriminated. Furthermore, Landon et al.'s (2003) results suggest that frequency manipulations are not required to heighten response-consequence contingency discriminability and thus sensitivity. These empirical results need to be reconciled with the current conceptualization of the DLOE.

The manner in which this is done is the subject of the final issue for consideration. Quantitative means by which to evaluate d_br in relation to the sensitivity parameter a seem of the utmost importance for further model development. Thus far, such comparisons have largely focused on the relative variance accounted for by the two parameters (e.g., Davison & Jenkins, 1985). While this is informative, parametric investigations into the relationship between them would seem to offer much greater benefit toward developing a concatenated DLOE.
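One concrete way to begin such parametric investigations is by simulation: generate choice ratios from the misallocation sketch above across a range of reinforcer-frequency ratios, fit the GMR to the simulated values, and observe how the recovered sensitivity a tracks the generating d_br. The following is a minimal sketch (all parameter values hypothetical, and Python standing in for the QuickBasic used for the experimental software):

```python
import numpy as np

def choice_ratio(r1, r2, d_br, c=1.0):
    # Predicted response ratio B1/B2 under the reinforcer-misallocation
    # sketch of the contingency-discriminability model: a proportion
    # d_br / (d_br + 1) of each alternative's reinforcers is credited
    # to the response that produced them.
    return c * (d_br * r1 + r2) / (r1 + d_br * r2)

# Hypothetical relative reinforcer-frequency ratios (R1:R2).
ratios = np.array([1/8, 1/4, 1/2, 1, 2, 4, 8])
r1 = ratios / (1 + ratios)  # normalize so that r1 + r2 = 1
r2 = 1 - r1

# Fit the GMR (log behavior ratio regressed on log reinforcer ratio)
# for several d_br values; the fitted slope a should rise from 0
# toward 1 as d_br increases.
log_r = np.log10(ratios)
for d_br in (1.0, 2.0, 4.0, 16.0):
    log_b = np.log10(choice_ratio(r1, r2, d_br))
    a, log_c = np.polyfit(log_r, log_b, 1)
    print(f"d_br = {d_br:5.1f} -> sensitivity a = {a:.2f}, bias log c = {log_c:.2f}")
```

Fitting the same regression to obtained rather than simulated data, and adding terms for magnitude or type, would permit exactly the d_br-versus-a comparisons suggested above.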
REFERENCES

Baum, W. M., & Rachlin, H. C. (1969). Choice as time allocation. Journal of the Experimental Analysis of Behavior, 12, 861-874.

Bron, A., Sumpter, C. E., Foster, T. M., & Temple, W. (2003). Contingency discriminability, matching, and bias in the concurrent-schedule responding of possums (Trichosurus vulpecula). Journal of the Experimental Analysis of Behavior, 79, 289-306.

Critchfield, T. S., & Lane, S. D. (2005). Is bad stronger than good? A search for differential impact effects in concurrent schedules, delay discounting, probability discounting, and risky choice. Manuscript submitted for publication.

Critchfield, T. S., & Merrill, D. (2005). Human concurrent schedule performance under heterogeneous reinforcement: Testing predictions of matching and contingency discriminability models. Manuscript submitted for publication.

Davison, M., & Jenkins, P. E. (1985). Stimulus discriminability, contingency discriminability, and schedule performance. Animal Learning & Behavior, 13, 77-84.

Davison, M., & McCarthy, D. (1988). The matching law: A research review. Hillsdale, NJ: Erlbaum.

Davison, M., & Nevin, J. A. (1999). Stimuli, reinforcers, and behavior: An integration. Journal of the Experimental Analysis of Behavior, 71, 439-482.

Findley, J. D. (1958). Preference and switching under concurrent scheduling. Journal of the Experimental Analysis of Behavior, 1, 123-144.

Hollard, V., & Davison, M. C. (1971). Preference for qualitatively different reinforcers. Journal of the Experimental Analysis of Behavior, 16, 375-380.

Landon, J., Davison, M., & Elliffe, D. (2003). Concurrent schedules: Reinforcer magnitude effects. Journal of the Experimental Analysis of Behavior, 79, 351-365.

Lattal, K. A. (1975). Reinforcement contingencies as discriminative stimuli. Journal of the Experimental Analysis of Behavior, 23, 241-246.

Logue, A. W., & deVilliers, P. A. (1981). Matching of behavior maintained by concurrent shock avoidance and food reinforcement. Behaviour Analysis Letters, 1, 247-258.

Magoon, M. A., & Critchfield, T. S. (2005). Concurrent schedules of positive and negative reinforcement: Differential-impact and differential-outcomes effects. Manuscript submitted for publication.

McLean, A. P., & Blampied, N. M. (2001). Sensitivity to relative reinforcer rate in concurrent schedules: Independence from relative and absolute reinforcer duration. Journal of the Experimental Analysis of Behavior, 75, 25-42.

Miller, H. L. (1976). Matching-based hedonic scaling in the pigeon. Journal of the Experimental Analysis of Behavior, 26, 335-347.

Rachlin, H. C. (1971). On the tautology of the matching law. Journal of the Experimental Analysis of Behavior, 15, 249-251.

Rachlin, H. C., Logue, A. W., Gibbon, J., & Frankel, M. (1986). Cognition and behavior in studies of choice. Psychological Review, 93, 33-45.

Ruddle, H. V., Bradshaw, C. M., Szabadi, E., & Foster, T. M. (1982). Performance of humans in concurrent avoidance/positive-reinforcement schedules. Journal of the Experimental Analysis of Behavior, 38, 51-61.

Trapold, M. A. (1970). Are expectancies based upon different positive reinforcing events discriminably different? Learning and Motivation, 1, 129-140.

Figure Caption

Figure 1. Intercept estimates (top panel) and intercept-estimate differences (bottom panel) of each matching function for each subject in each experiment. In the top panel, dark bars represent the intercept estimates for the "baseline" phase matching functions and light bars represent the intercept estimates for the "experimental" phase matching functions. In the bottom panel, bars represent the difference between the "experimental" matching function intercept estimate and the "baseline" matching function intercept estimate.

[Figure 1: two bar panels, y-axes ("Intercept" and "Intercept Difference") spanning -0.5 to 2.0, for subjects TB101-SS114 grouped as Exp. 1 (N:P), Exp. 2 (IP:P), Exp. 3 (TN:TP), and Exp. 4 (TN:TN); top-panel legend: Baseline, Experimental.]