, Daina Economidou1, Yann Pelloux1 and Barry J. Everitt1
(1)
Department of Experimental Psychology, University of Cambridge, Cambridge, UK
(2)
UKAVENIR team Psychobiology of Compulsive Disorders, Pôle Biologie Santé CNRS UMR 6187 & Université de Poitiers, Poitiers, France
Abstract
Our increasing understanding of the psychological mechanisms involved in the transition from controlled to habitual compulsive drug use, the hallmark of drug addiction, relies on animal models in which the underlying behavioral construct reflects some of the main features of drug addiction in humans, such as foraging for the drug during extended periods of time, habitual drug seeking behavior and drug seeking or drug taking behaviors that are maintained despite adverse consequences. We have placed great emphasis on the development of behavioral procedures whereby animals not only self-administer drugs, but pathologically seek and take drugs in a way that resembles the clinical condition in human drug addicts. Thus, over the last 10 years we have developed models in rats that specifically address the development of habitual drug seeking behavior, compulsive cocaine seeking and taking behavior, and even addiction-like behavior. In this chapter, we review the behavioral procedures, namely second-order schedules of reinforcement, two-link heterogeneous chained schedules of reinforcement and the “three addiction-like behavioral criteria selection procedure” that we have used in rats to model habitual drug seeking behavior, compulsive drug seeking and taking behavior and addiction-like behavior. Although not yet widely adopted, these models have already contributed to the identification of some neurobiological and psychological mechanisms involved in the vulnerability to drug addiction and the transition from controlled to compulsive drug use, thereby emphasizing their great heuristic value in attempts to understand drug addiction.
Key words
Cocaine, Striatum, Dopamine, Prefrontal cortex, Habits, Compulsivity, Rats, Second-order schedules of reinforcement, Seeking–taking schedules of reinforcement
1 Introduction
All drugs abused by humans have reinforcing properties in many species, including planarians (1) and flies (2, 3), and they are readily self-administered by vertebrates such as mice (4–8) or rats (9–13), dogs (14), and nonhuman primates (15–21). Thus, animal models of sensitivity to the reinforcing properties of addictive drugs as well as the initiation and maintenance of drug taking (see previous chapters) have great heuristic value for the pharmacology of drug addiction since they show both face and predictive validities.
These animal models have increased our understanding considerably of the psychological, neural, cellular, and molecular mechanisms whereby addictive drugs exert their reinforcing effects (22–30), but they do not address the transition from controlled to habitual and compulsive drug use, the hallmark of drug addiction (Table 1). Indeed, among the individuals exposed to drugs, and there are many who occasionally drink only a glass or two of an alcoholic beverage, or smoke a cigarette or two, only 15–30% overall will switch from casual, “recreational” drug use to drug abuse and drug addiction (47) (Fig. 1). These drug addicts not only take drugs, they spend great amounts of time foraging for their drugs, compulsively take drugs, lose control over drug intake, and persist in taking drugs despite the many adverse consequences of doing so, including compromising their health, family relationships, friendships, and work. Many drug addicts resort to criminal behavior to obtain the funds necessary to sustain their compulsive drug use and the great majority eventually relapse to drug use even after prolonged periods of abstinence.
Table 1
Animal models of habitual and compulsive drug use in reference to the DSM-IV criteria (Adapted from (31))
Fig. 1.
Interindividual differences in the vulnerability to develop drug addiction. A substantial proportion of the general population experiences drugs at least once in a lifetime. Of the recreational users who control their drug intake, some will shift to more chronic drug use. Only a subgroup of these individuals will develop drug abuse and eventually drug addiction. Epidemiological studies reveal that, of the individuals who have been exposed to addictive drugs, 15–20% eventually develop drug addiction (47). Drug addicts compulsively take drugs in that they maintain drug use despite obvious social, health, and professional adverse consequences, and increasing evidence suggests that there might be a shift from impulsivity to compulsivity in the control over drug taking during the development of drug addiction.
This negative behavioral picture illustrates how drug addiction is not merely a drug taking disorder, but is also defined as a chronic relapsing disorder characterized by loss of control over drug intake and compulsive drug seeking and taking that is maintained despite adverse consequences, as described by the seven diagnostic criteria in the DSM-IV (31) (Table 1). Therefore, it is important to separate two issues: why people take drugs and why people compulsively take drugs (48). There is increasing evidence suggesting that drug addiction results from gradual adaptation processes in the brain of vulnerable subjects in response to chronic drug exposure, which may ultimately lead to a shift in the psychological mechanisms that govern drug seeking and drug taking behaviors, including habits (49, 50), aberrant instrumental learning mechanisms controlled by Pavlovian cues, failure in behavioral control (51, 52), decision making, and self-monitoring processes (53). Similarly, we have argued previously that, during the development of drug addiction, drug seeking is initially goal directed but becomes habitual, and ultimately compulsive, thereby emphasizing the potential importance of maladaptive automatic instrumental learning mechanisms and their control by Pavlovian incentive processes in the emergence of compulsive drug use (51, 54–56). At the neurobiological level, we have hypothesized that this progressive transition in the psychological mechanisms that govern drug seeking behavior may reflect a progressive shift from prefrontal to striatal mechanisms paralleled by a progression from ventral to dorsal striatum as the locus of control over behavior (55). Therefore, the investigation of the transition from controlled to compulsive drug use, which is one of the most fundamental issues, if not the most fundamental, for research on drug addiction, explicitly raises the importance of the construct validity of animal models of habitual and compulsive drug seeking and drug taking behaviors.
Recently developed animal models of habitual and compulsive cocaine seeking and taking (37, 39, 40, 44, 46, 57, 58) have indeed proven to be useful experimental tools for investigating the psychobiological mechanisms underlying the transition from controlled drug use to drug addiction and have provided evidence to support the notion that a ventral to dorsal striatum shift occurs in the control over well established, or habitual, cocaine seeking (58–60).
In this chapter, we will thus review the contribution of animal models of habitual and compulsive drug seeking and taking to the advances made in the understanding of the pathophysiology of drug addiction. More precisely, we describe experimental procedures currently developed to model compulsive drug seeking and drug taking behavior and review their contribution to the understanding of the psychological mechanisms involved in the development of drug addiction. Finally, we will focus on an animal model of addiction-like behavior in the rat that integrates both the compulsive feature of drug addiction and interindividual differences in the vulnerability to develop drug addiction, thereby providing a powerful tool for the investigation of the behavioral and biological factors of vulnerability to drug addiction.
2 Habits in Drug Addiction: Theoretical Implications and Empirical Evidence
2.1 Theoretical Implication of Habits in Drug Addiction: Historical Background
Early theoretical arguments concerning the importance of “automatic processes” or “habits” in the development of drug addiction (50, 54, 61) were based on the observation that “drug-use behaviors tend to be relatively fast and efficient, readily enabled by particular stimulus configurations (i.e., stimulus bound), initiated and completed without intention, difficult to impede in the presence of triggering stimuli, effortless, and enacted in the absence of awareness.” These behavioral features resonate well with the five main characteristics of behavioral automaticity, often assumed to reflect habits that Tiffany extracted from a meta-analysis of the literature: speed, autonomy, lack of control, effortlessness, and absence of conscious awareness. Indeed, automatic processes are associated with an increased speed and decreased variability of performance. Thus in humans, habitual responses do not depend upon awareness but are directly triggered by conditioned stimuli without any recruitment of higher cognitive processes, such as intention or decision making, and are therefore difficult to inhibit in the presence of the eliciting stimuli. Additionally, habitual responses are relatively insensitive to variations in their consequences, as illustrated by William James’s description, in the late 19th century, of a clear dissociation between plans and actual actions in specific contexts: “very absent-minded persons in going to their bedroom to dress for dinner have been known to take off one garment after another and finally to get into bed, merely because that was the habitual issue of the first few movements when performed at late hours.” In this description, environmental stimuli (late hour, entering the bedroom) trigger an action that is dissociated from the initial goal, which can then be considered as devalued. 
Similarly, more recent studies have reported that driving habits are impervious to the value of the goal: individuals with a strong driving habit will, for example, continue using their car instead of shifting to public transportation to commute to work even when the highway they use for commuting is closed (62).
2.2 Drug Addiction as the “Bad Habit” of Drug Seeking and Drug Taking Behaviors
The aforementioned behavioral features of automation, or habits, show great construct validity when integrated into the psychobiological framework of drug addiction. Indeed, the development of habitual action schemata whereby drug seeking and taking are stimulus bound and beyond cognitive control may account for various behavioral features observed in human addicts, such as stimulus-bound relapse even after protracted abstinence, and stimulus-maintained drug seeking over prolonged periods of time.
These habits may even encompass higher schemata, themselves controlling goal-directed sequences of behavior, but nevertheless triggered specifically by drug-associated stimuli. The stimuli that trigger drug seeking are multiple and varied, ranging from drug-associated paraphernalia, locations, internal states, such as withdrawal or anxiety, friends, or the drug itself. Therefore inflexible habitual drug seeking responses can be elicited by either external (environmental) or internal (thoughts or internal states) stimuli. These drug-associated stimuli may eventually play another important role in supporting drug seeking behavior in addicts who, in real life, spend a great deal of time foraging for the drug, often involving long sequences of behavior during which the drug-associated cues, acting as conditioned reinforcers, bridge delays between drug seeking and drug taking.
2.3 From Actions to Habits to Compulsion: Progressive Psychological Shifts in the Development of Addiction
We have suggested that addictive drugs may subvert natural learning processes, such as action–outcome and stimulus–response (S–R) instrumental learning mechanisms, as well as Pavlovian-instrumental interactions, thereby facilitating the development of habitual control over drug seeking and drug taking (50). Thus, drug use that is initially a goal-directed action, controlled by the reinforcing properties of the drug, progressively becomes divorced from these reinforcing properties and more controlled by stimuli in the environment that have repeatedly been associated with the drug. The development of habitual drug use cannot alone account for the development of compulsive drug use, which is hypothesized to involve a failure in cortical top-down control mechanisms, but it may nevertheless play a central role in the development of drug addiction. We have thus developed the hypothesis that drug addiction results from progressive adaptations in the brain, which ultimately lead to a loss of executive control over maladaptive drug seeking and taking habits (55, 56).
2.4 Goal-Directed and Habitual Control over Instrumental Performance in Animals
It is now well established that the same instrumental response can be mediated by either a goal-directed (action–outcome, A–O) or a habitual (S–R) system (Fig. 2).
Fig. 2.
Psychological mechanisms involved in the control over drug seeking. (a) Schematic of the Pavlovian and instrumental associative learning processes involved in the acquisition of drug seeking and drug taking behaviors. (b) When drug taking behavior is under the control of action–outcome processes, the instrumental response is underpinned by an explicit representation of the motivational and/or sensory-specific properties of the drug. Action–outcome processes are likely to control drug seeking and drug taking behavior soon after the acquisition of self-administration and over protracted periods of training provided the contingency between the action and drug infusion is high, for example, under FR schedules in experimental settings. (c) When drug taking is habitual, that is, governed by S–R processes, the instrumental response is triggered by conditioned stimuli in the environment and becomes somehow divorced from the drug. Habitual drug seeking and taking behavior are predicted to occur after extended training under conditions in which there is a weak contingency between the action and the outcome, such as fixed intervals or second-order schedules of reinforcement.
Behavioral procedures have been developed to investigate the nature of the psychological substrate that governs instrumental behavior in animals. When instrumental responding is habitual, the animal’s instrumental performance is no longer an action in that it is not under the direct control of the representation of the motivational value of the outcome (action–outcome, A–O), but is instead an automatic response triggered by the stimuli associated with the outcome (S–R). Thus, habitual instrumental performance is impervious to affective devaluation of the outcome – usually by pre-feeding to satiety or lithium chloride-induced malaise – or contingency degradation, whereas these manipulations markedly reduce instrumental responding when it is controlled by an A–O process (Fig. 2). Thus, to demonstrate habitual control over instrumental performance requires a direct manipulation of the incentive properties of the reinforcer, a manipulation that has not yet been developed for non-ingestive reinforcers, such as intravenously self-administered drugs.
Dickinson and colleagues (63, 64) have defined the conditions whereby instrumental responding shifts from being controlled by a goal-directed to a habitual mechanism. Thus, overtraining under fixed-ratio schedules of reinforcement and limited exposure to interval schedules of reinforcement both result in the transition from A–O to S–R control over instrumental responding for natural rewards, that is, performance of the instrumental response is unaffected by devaluation of the outcome when tested in extinction (64).
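One way to illustrate why interval schedules are thought to favor S–R control is to compare how strongly the rate of reward depends on the rate of responding under ratio and interval schedules. The sketch below is only an illustration under stated assumptions, not the analysis used in the cited studies: it assumes responses occur at random (Poisson) times, that a random ratio schedule rewards each response with probability 1/ratio, and that a random interval schedule arms a reward after an exponentially distributed delay and holds it until the next response. The function names and the example values are hypothetical.

```python
def rr_reward_rate(resp_per_min, ratio):
    """Random ratio: reward rate grows in proportion to response rate."""
    return resp_per_min / ratio


def ri_reward_rate(resp_per_min, mean_interval_min):
    """Random interval with Poisson responding (an assumption).

    Mean time per reward = mean arming delay + mean wait for the next
    response = mean_interval_min + 1 / resp_per_min.
    """
    return resp_per_min / (1.0 + resp_per_min * mean_interval_min)


if __name__ == "__main__":
    # Illustrative values only: RR 10 versus RI 30 s (0.5 min).
    for rate in (5, 10, 20, 40):
        print(rate, rr_reward_rate(rate, 10), round(ri_reward_rate(rate, 0.5), 3))
```

Under these assumptions, doubling the response rate doubles the reward rate on the ratio schedule but barely changes it on the interval schedule once responding is moderately fast, which is one sense in which the action–outcome contingency is weak under interval schedules.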
At the neural systems level, A–O and S–R learning processes depend upon dissociable structures, as demonstrated by lesion or inactivation procedures in rats. Thus, lesions of the prelimbic cortex (65, 66) or glutamate receptor blockade in the dorsomedial striatum (67, 68) disrupt goal-directed responding, whereas lesions or inactivation of the infralimbic cortex (65, 66), the dorsolateral striatum (67, 69, 70), or its dopaminergic innervation (71) abolish S–R control over behavior, thereby rendering instrumental performance sensitive to the motivational value of the outcome in overtrained rats.
Although the potential implication of behavioral autonomy in the development of drug addiction was first suggested almost 20 years ago (61), it is only in the last 5 years that animal models of habitual drug seeking have been developed. Even though initially limited to oral drug self-administration, these animal models have provided the first clear experimental evidence that drugs of abuse such as cocaine and alcohol actually facilitate the development of the S–R process compared to natural rewards (72, 73).
3 Animal Models of Drug-Induced Habits: Oral Self-Administration and Behavioral Sensitization
3.1 Oral Drug Self-Administration
In a series of studies, Dickinson and colleagues have developed animal models of habitual oral self-administration of alcohol and cocaine (72, 73). In these two experiments, rats were initially trained to respond on one manipulandum (lever or rod) for a food pellet or a lemon-sucrose solution and a second manipulandum (lever or rod) for an alcohol (10%)- or cocaine (0.1%)-sucrose (10%) solution. Responses on each manipulandum were reinforced under random interval (RI) schedules, in which the contingency between the response and the outcome is not explicit, thereby favoring the development of the S–R process in the control over instrumental performance. After ten training sessions during which animals had similar access to the natural or drug reward, aversion conditioning took place whereby each reinforcer was specifically devalued without interfering with the other reinforcer. For this, each animal in the natural reward-devalued group received noncontingent presentations of the natural reward followed by an intraperitoneal (IP) injection of an isotonic LiCl solution whereas rats in the drug-reward group received noncontingent presentations of the drug solution and then the LiCl IP injection. Animals were then subjected to a single (8 or 10 min) extinction session during which they had access to the two manipulanda and their instrumental responses were recorded. Since the devaluation procedure took place without presentation of the manipulanda, instrumental performance during extinction was not related to any direct effect of the aversion conditioning upon instrumental performance. Figure 3a, b illustrates the results of the two studies, emphasizing that rats orally self-administering alcohol and cocaine showed facilitated resistance to reinforcer devaluation compared to natural rewards.
Thus, after equivalent training, consumption of addictive drug-containing solutions resulted in habitual control over instrumental performance at times when responding for natural rewards remained goal directed.
Fig. 3.
Oral drug self-administration facilitates the instantiation of habitual food seeking behavior. Rats orally self-administering alcohol and cocaine showed facilitated resistance to reinforcer devaluation compared to natural rewards. Left panels: Instrumental performance for oral alcohol self-administration following affective devaluation of the outcome. Data presented are mean number of responses on the lever associated with pellets or ethanol during the extinction session (see text for details). Whereas specific devaluation of the pellet reward produced a decrease of responding on the associated lever, thereby showing that instrumental responding for this natural reward was goal-directed, the same manipulation failed to influence lever presses for ethanol, revealing that ethanol seeking was habitual. Right panels: Lever pressing for oral cocaine self-administration following devaluation of the outcomes (see text for details). Whereas a devaluation effect was observed on instrumental performance associated with the lemon-sucrose, the same manipulation failed to diminish instrumental performance associated with the cocaine-sucrose solution, thereby demonstrating that self-administration of cocaine facilitated the establishment of S–R control over instrumental performance (Left panels: Adapted from (72); Right panels: Adapted from (73)).
3.2 Behavioral Sensitization
Behavioral sensitization is a widely used animal model of drug addiction (see Chap. 7 in this book) that captures long lasting (74) pathological behavioral and biological adaptations to repeated exposure to addictive drugs. Thus, sensitization to cocaine or amphetamine develops in response to a sub-chronic treatment regimen and is characterized by potentiated locomotor and neurochemical responses (i.e., dopamine release) to a drug challenge after repeated experimenter-delivered infusions of this drug. These behavioral and neurochemical features have been suggested to reflect an increased sensitivity to the motivational, or “incentive” properties of drug-associated stimuli and the drug itself, as referred to in the “incentive sensitization” theory of drug addiction (75), thereby emphasizing the role of psychomotor sensitization in the development of drug addiction.
Behavioral sensitization has been related to alterations in dopamine release, synaptic plasticity, and gene expression in both the ventral and the dorsal striatum (76), and especially the more lateral part of the dorsal striatum, the neurobiological locus of formation of S–R associations (67, 69, 70). Thus, on the basis of the neurobiological profile of cellular and molecular adaptations in response to repeated exposure to psychostimulants that includes the dorsolateral striatum, it has been hypothesized that behavioral sensitization may facilitate the development of S–R habits. Using behavioral sensitization procedures, that is, daily IP injections of amphetamine (2–2.5 mg/kg daily, 5–7 days), the long-term (4–6 weeks) influence of drug exposure on the sensitivity to reinforcer-specific satiety and lithium chloride-induced nausea devaluation procedures has been measured (77, 78). In both cases, devaluation tests were performed after short-term lever-press training for two alternative reinforcers, one associated with each lever, under a random interval schedule of reinforcement (30 or 60 s). After completion of the devaluation procedures, controlling for the specificity of the reinforcer, lever presses were measured during a 10 min extinction session. The results of these experiments are illustrated in Fig. 4.
Fig. 4.
Cocaine and amphetamine sensitization facilitates habit formation. (a) Lister hooded rats initially received one daily IP injection of amphetamine (2 mg/kg) (amphetamine group) or saline (vehicle group) for 7 days. Seven days after cessation of this sensitization regimen, rats were trained to enter a magazine to receive either a sucrose or maltodextrin solution (counterbalanced across treatment and devaluation group). Animals were then trained to respond on a lever, initially under continuous reinforcement and subsequently under an RI 30 s schedule of reinforcement. Animals’ exposure was equated to each of the two reinforcers prior to the devaluation test. Devaluation was performed by specific satiety (1 h access to the instrumental reinforcer) and both magazine entries and lever presses were measured under extinction (8 min). As illustrated, amphetamine-sensitized rats are resistant to this manipulation since they maintained lever pressing and magazine entries whereas vehicle-treated animals showed a marked decrease of their responses. (b) Rats initially received repeated injections of either saline or amphetamine and were subsequently trained to press two levers, each of which was paired with a specific liquid reward. Once stable behavioral performance had been established and rats reliably discriminated the two levers and their associated rewards, devaluation of one of the outcomes was carried out by three post-training injections of lithium chloride for one reinforcer, whereas non-devalued animals received saline infusions following presentation of the other reinforcer. When tested under extinction, control animals show greater suppression of lever pressing than amphetamine-treated rats when devaluation is carried out soon after stabilization of performance (left) but not after extended training (right).
(c) Long–Evans rats were subjected to a cocaine sensitization regimen with one daily IP injection of cocaine for 14 days while a vehicle group received injections of equivalent volumes of vehicle. Twenty-one days following cessation of the sensitization regimen, rats were trained to acquire a light-food pellet Pavlovian association for eight daily sessions at the end of which procedure they were assigned to a devalued and non-devalued group. Devaluation was induced by taste aversion (IP injection of lithium chloride following noncontingent presentation of food). Rats were tested under extinction where the food-associated CS was presented in the absence of food delivery. As illustrated, cocaine-sensitized rats trained under a Pavlovian learning task did not reduce conditioned approach responses toward a CS after devaluation of its associated US unlike control animals (a: Adapted from Nordquist et al. 2007; b: Adapted from (77); c: Adapted from (79)).
Nordquist and colleagues showed that after 12 sessions of training under a random-interval schedule, both amphetamine and saline-treated animals showed habitual control over instrumental performance since the satiety-devaluation procedure did not alter lever presses during the extinction session in either group. However, after only six sessions of training, although saline-treated rats were sensitive to the devaluation procedure, amphetamine-treated rats maintained a level of responding for the devalued reinforcer that was similar to that for the non-devalued reinforcer, thereby revealing that instrumental performance was no longer under the control of the motivational value of the reinforcer. Similarly, Nelson and Killcross showed that amphetamine-treated rats did not diminish their active lever presses for the devalued reinforcer when tested under extinction after LiCl-induced nausea at a time when saline-treated rats remained sensitive to devaluation.
These experiments strongly suggest that noncontingent exposure to addictive drugs, or at least psychostimulants, triggers neural mechanisms that facilitate the instantiation of inflexible habitual instrumental behavior.
Although interesting, these studies raise an important issue concerning the behavioral features of human drug addicts: drug addicts display habitual, inflexible, and compulsive behavior toward the drug, but not natural rewards. Thus, if addictive drugs facilitate habitual control over natural rewards in animal models, why is it that human addicts show compulsive habitual drug taking to the detriment of natural rewards? Interestingly, Nelson and Killcross showed that a post-training sensitization regimen did not affect the performance during extinction following the devaluation procedure compared to controls, suggesting that habit learning can be facilitated by exposure to addictive drugs, whereas performance of already acquired S–R habits is not. Although this experiment was initially designed to control for any effect of sensitization upon the expression, rather than learning, of goal-directed actions, these data may allow reconciliation of the animal and human literature. We speculate that, after the initial exposure to an addictive drug, only newly acquired drug-related activities may be instantiated as inflexible habits.
Interestingly, cocaine sensitization also results in maladaptive perseverative responding after Pavlovian reinforcer devaluation (80) (Fig. 4c). Schoenbaum and Setlow showed that rats given daily IP cocaine injections (30 mg/kg) for 2 weeks and trained under a Pavlovian learning task 3 weeks after cessation of drug treatment did not reduce conditioned approach responses toward a CS after devaluation of its associated US, unlike control animals that had received saline infusions. Thus, psychostimulant exposure not only enhances the development of rigid operant responses, but also the development of rigid Pavlovian approach responses.
However, in all the studies discussed above, all rats, including cocaine-sensitized rats, showed a consummatory aversion to the reinforcer after devaluation, revealing that drug exposure does not influence consummatory responses despite altering instrumental performance. It is therefore important to develop animal models that allow a dissociation between the instrumental, Pavlovian, and behavioral control mechanisms involved in voluntary, habitual, or compulsive drug self-administration. The experimental test of this hypothesis is dependent upon the ability to dissociate in animals preparatory from consummatory responses for addictive drugs, the latter being, at least for natural rewards, governed by Pavlovian mechanisms. We thus implemented procedures in rats that allow us to dissociate drug seeking from drug taking behavior, namely, second-order schedules of drug reinforcement (17, 41, 42, 81) and two-link heterogeneous chained schedules of reinforcement (39, 43, 82).
4 The Distinction Between Drug Seeking and Drug Taking Behavior: Second-Order and Two-Link Heterogeneous Chained Schedules of Reinforcement
In trying to separate drug seeking from drug taking, schedules of reinforcement must be implemented in which operant responding for the drug during the drug seeking phase is not affected by the drug itself, that is, so that drug seeking behavior can be measured without interference by stimulant or sedative actions of the self-administered drug.
Two-link heterogeneous chained schedules of reinforcement aim to dissociate spatially, temporally, and instrumentally drug seeking from drug taking behavior. Second-order schedules of reinforcement allow the investigation of cue-controlled drug seeking over prolonged periods of time.
4.1 Two-Link Heterogeneous Chain Schedules of Reinforcement
In this procedure, completion of the first link of the chain, designated the seeking link, results in access to the second, or taking, link which permits, once performed, the delivery of the reinforcer. Acquisition of the chain schedule is achieved through successive steps of increasing complexity which start with introduction of the taking lever. A lever press is then reinforced under a fixed-ratio (FR) 1 schedule so that each lever press produces drug reinforcement accompanied by the withdrawal of the taking lever. After several sessions of stable responding, the seeking lever is introduced while the taking lever is retracted. The first press on the seeking lever initiates a random interval (RI) schedule; the first seeking lever press that occurs after the RI has elapsed terminates the first link of the chain, which results in retraction of the seeking lever and insertion of the taking lever to initiate the second link. One press on the taking lever results in the presentation of the reinforcer followed by a time-out period. Thereafter, the seeking lever is reinserted to start the next cycle of the schedule. The effects of experimental manipulation can thus be assessed through measures of seeking responding (latency, number, or response rate) as well as taking responding (latency). The value of dissociating seeking and taking behavior is clear when one considers that the two instrumental components are influenced by dissociable processes since they are differentially sensitive to devaluation, incentive learning, or Pavlovian manipulations (83). In addition, cocaine seeking performance is monotonically related to the dose of drug with a relatively long time-out (43), and is profoundly affected by extinction of the taking link (82).
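The cycle described above can be sketched as a small state machine. This is a minimal illustration rather than the control program used in the cited experiments: the class name, the state labels, the returned messages, the exponential distribution of the random interval, and the default values (mean RI of 30 s, time-out of 60 s) are all assumptions made for the example.

```python
import random


class SeekingTakingChain:
    """Sketch of a two-link chain: seeking link (RI) then taking link (FR1)."""

    def __init__(self, mean_ri_s=30.0, timeout_s=60.0, draw_interval=None):
        rng = random.Random(0)
        # Assumption: random intervals are drawn from an exponential distribution.
        self.draw_interval = draw_interval or (
            lambda: rng.expovariate(1.0 / mean_ri_s)
        )
        self.timeout_s = timeout_s
        self.state = "SEEK_IDLE"      # seeking lever out, RI not yet started
        self.ri_ends_at = None
        self.timeout_ends_at = None
        self.seeking_presses = 0
        self.infusions = 0

    def press(self, lever, t):
        """Process a press on 'seeking' or 'taking' at time t (seconds)."""
        if self.state == "TIMEOUT" and t >= self.timeout_ends_at:
            self.state = "SEEK_IDLE"  # seeking lever reinserted after time-out
        if self.state == "TIMEOUT":
            return "ignored (time-out, levers retracted)"
        if lever == "seeking" and self.state in ("SEEK_IDLE", "SEEK_RUNNING"):
            self.seeking_presses += 1
            if self.state == "SEEK_IDLE":
                # First seeking press of the cycle starts the random interval.
                self.ri_ends_at = t + self.draw_interval()
                self.state = "SEEK_RUNNING"
                return "RI started"
            if t >= self.ri_ends_at:
                # First seeking press after the RI has elapsed ends link 1.
                self.state = "TAKE"
                return "seeking lever retracted, taking lever inserted"
            return "seeking press (RI not elapsed)"
        if lever == "taking" and self.state == "TAKE":
            # FR1 on the taking lever: one press delivers the reinforcer.
            self.infusions += 1
            self.state = "TIMEOUT"
            self.timeout_ends_at = t + self.timeout_s
            return "infusion delivered, time-out started"
        return "ignored (lever not available)"
```

Keeping a separate counter of seeking presses and infusions mirrors the dependent measures mentioned in the text (number or rate of seeking responses, taking responses); latencies could be derived in the same way from the press timestamps.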
4.2 Second-Order Schedule of Cocaine Reinforcement
In the street, drug seeking behavior is stimulus bound: drug addicts forage for their drug under the control of environmental stimuli that act as conditioned reinforcers and support long sequences of behavior in the absence of the outcome. More formally, conditioned reinforcers are stimuli that have themselves acquired rewarding properties after repeated associations with unconditioned rewards. Conditioned reinforcers bridge delays between seeking and obtaining the drug. Conditioned stimuli (CSs) associated with psychostimulants, opiates, speedball, cannabis, or nicotine act as powerful conditioned reinforcers since they greatly enhance drug seeking behavior when presented contingently, but not noncontingently, upon instrumental responding, usually under interval schedules of reinforcement (17, 41, 84–87). Conditioned reinforcers can also support the acquisition of a new instrumental response (88). Such properties are clearly demonstrated in procedures where animals work to obtain presentation of a conditioned stimulus, often in the absence of the unconditioned reward.
In the second-order schedules of reinforcement that we have used, the CS is presented response-contingently, usually under a FR schedule, during an overall fixed interval (FI) or FR schedule for the primary reinforcer, and markedly enhances and maintains responding for long periods of time (Fig. 5). Thus, under a second-order schedule of reinforcement, a strong contingency exists between the instrumental response and the presentation of the CS (under a FR), alongside a relatively weaker contingency between instrumental performance and the outcome (the drug), which is delivered only after completion of the first ratio after each interval has elapsed. Such schedules therefore facilitate the development of S-R control over instrumental responding. In addition, it has been shown that omission of CS presentation in second-order schedules of reinforcement disrupts cocaine seeking more than food seeking behavior (85), suggesting that prolonged psychostimulant seeking is particularly dependent upon conditioned reinforcement. Thus, instrumental responding during the first interval of a second-order schedule of reinforcement shows face and construct validity with regard to the behavioral features of drug seeking in humans: it is stimulus bound, somewhat dissociated from the unconditioned effects of the drug, and long lasting.
Fig. 5.
Acquisition of cocaine seeking under a second-order schedule of reinforcement. Figure representing the instrumental performance of a population of rats during the first interval of a FI15 and FI15(FR10:S) schedule of reinforcement (see text for explanation). The acquisition of cocaine seeking under the influence of Pavlovian conditioned cues is a rather long process during which the animal is trained to respond on an active lever to seek the drug in a drug-free state for a prolonged period of time, usually 15 min in rats. The early training phase consists of 3 days of FR1 training, 2 h daily sessions, 30 infusions (0.25 mg cocaine/infusion) paired with contingent presentation of a drug-associated cue (a light). Once animals have acquired self-administration under continuous reinforcement, the reinforcement schedule is switched to fixed intervals, with daily increments: FI1 min, FI2 min, FI4 min, FI8 min, FI10 min, and FI15 min. After 3 days of training under the FI15 schedule (left part of the figure), contingent presentations of the CS are introduced under a FR10 schedule such that rats are now trained under a FI15(FR10:S) second-order schedule of reinforcement. This acquisition procedure provides a direct measure of the potentiation of responding during interval schedules by the contingent presentation of the CS since it is introduced only once responding under fixed interval has stabilized. Thus, although the average response rate is 50–70 during the first interval of a FI15 schedule, it reaches 150–200 when the CS is contingently presented.
Second-order schedules of cocaine and heroin self-administration were initially developed by Goldberg and colleagues in nonhuman primates to assess the influence of environmental stimuli upon drug self-administration (15, 17, 18). We have also established second-order schedules of drug reinforcement in rats (41). In the study by Arroyo and colleagues (41), rats were initially required to learn self-administration of cocaine under continuous reinforcement, that is, FR1. After stabilization of responding (5–7 daily 2-h sessions), a second-order schedule with FR components of the type FR x (FR y :S) was introduced, with initial values of x and y set to 1, so that each active lever press resulted in the presentation of the CS and the delivery of 0.25 mg of cocaine. Then x and y values were progressively increased with increments in response requirements starting with x, that is, FR5(FR1:S) and FR10(FR1:S), then y, that is, FR10(FR2:S), FR10(FR4:S), FR10(FR7:S), and FR10(FR10:S). After stabilization of responding under this FR10(FR10:S) schedule of reinforcement, which therefore requires 100 active lever presses and ten 1 s presentations of the CS to obtain a cocaine infusion, a final fixed interval schedule FI15(FR10:S) was introduced such that a cocaine infusion was delivered only following the tenth active lever press that occurred when the 15 min interval had elapsed. Finally, rats were allowed to perform cocaine seeking behavior under this schedule for 10 days. This acquisition procedure produces robust and stable CS-dependent rates of responding (41) and has been used extensively to probe the neural mechanisms involved in the acquisition and the performance of cue-controlled cocaine seeking (58–60, 89).
It is also possible to decrease the acquisition period to 11 days (90). In this case, the training phase consists of 3 days of FR1 training, 2-h daily sessions, 30 infusions (0.25 mg cocaine/infusion) followed by the introduction of interval schedules, with daily increments: FI1 min, FI2 min, FI4 min, FI8 min, FI10 min, and FI15 min. After 3 days of training under the FI15 schedule, contingent presentations of the CS are introduced under a FR10 schedule such that rats are now trained under a FI15(FR10:S) second-order schedule of reinforcement. This acquisition procedure provides a direct measure of the potentiation of responding during interval schedules by the contingent presentation of the CS since it is introduced only once the responding under fixed interval has stabilized. Thus, although the average response rate is 50–70 during the first interval of a FI15 schedule, it reaches 150–200 when the CS is contingently presented (Fig. 5), as described by Belin and Everitt in a study addressing intrastriatal mechanisms involved in habitual cocaine seeking (58). Indeed, short and long-term training under second-order schedules of reinforcement for cocaine have been very useful for investigating the neural mechanisms involved in the transition from newly acquired to well-established or habitual cue-controlled cocaine seeking.
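The contingencies arranged by the FR x (FR y :S) and FI15(FR10:S) schedules described above can be made concrete with a short Python sketch. This is an illustration under stated assumptions, not the software used in the cited experiments: the function names are hypothetical, only a single interval is modeled, and the FI version assumes that the infusion follows the first FR y unit completed once the interval has elapsed, with units counted continuously from the start of the interval.

```python
def run_fr_second_order(n_presses, x, y):
    """Count CS presentations and infusions under FR x (FR y:S).

    Every y-th active lever press produces the CS; every x-th CS-producing
    unit also produces a drug infusion. Hypothetical helper for illustration.
    """
    cs_count = 0
    infusions = 0
    for press in range(1, n_presses + 1):
        if press % y == 0:          # one FR y unit completed: CS presented
            cs_count += 1
            if cs_count % x == 0:   # x units completed: drug delivered
                infusions += 1
    return cs_count, infusions


def run_fi_second_order(press_times_s, interval_s, y):
    """Model the first interval of an FI(FR y:S) schedule.

    Returns (cs_times, infusion_time). The CS follows every y-th press; the
    infusion is assumed to follow the first FR y unit completed at or after
    `interval_s`. infusion_time is None if that never happens.
    """
    cs_times = []
    presses_in_unit = 0
    for t in sorted(press_times_s):
        presses_in_unit += 1
        if presses_in_unit == y:
            presses_in_unit = 0
            cs_times.append(t)
            if t >= interval_s:
                return cs_times, t
    return cs_times, None


# FR10(FR10:S): 100 presses yield ten CS presentations and one infusion.
cs, inf = run_fr_second_order(100, x=10, y=10)

# FI15(FR10:S) with a hypothetical rat pressing once every 7 s for 20 min.
times = [7 * i for i in range(1, 172)]
cs_times, infusion_time = run_fi_second_order(times, interval_s=15 * 60, y=10)
```

Under FR10(FR10:S) the sketch reproduces the requirement of 100 active lever presses and ten CS presentations per infusion; under the FI version, the number of CS presentations earned during the interval depends entirely on the animal's response rate, whereas the drug is delivered at most once.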
4.3 Neurobiology of Cue-Controlled Cocaine Seeking: A Ventral to Dorsal Striatum Progression in the Locus of Control over Behavior Parallels the Development of Habitual Drug Seeking
The acquisition of cue-controlled cocaine seeking depends upon the basolateral nucleus of the amygdala (BLA) (91–93), the AcbC (89, 94), and also the orbitofrontal cortex (OFC) (95, 96) (Fig. 6a). Performance of cue-controlled cocaine seeking depends upon the VTA (97) and the interaction between the BLA and the AcbC (94) (Fig. 6). Finally, the nucleus accumbens shell mediates the dopamine-dependent potentiating effects of cocaine over cue-controlled cocaine seeking (89). When cue-controlled cocaine seeking becomes well established, or habitual, that is, after several weeks of training under a FI second-order schedule of reinforcement, contingent presentations of CSs increase extracellular dopamine concentration in the dorsolateral striatum (DLS) but not in the AcbC nor in the AcbS (59). Moreover, bilateral dopamine receptor blockade in the DLS, but not in the AcbC, selectively reduces cocaine seeking habits in rats (58, 60).
Fig. 6.
Neurobiological substrates of the acquisition, maintenance, and habitual performance of cue-controlled cocaine seeking behavior. Acquisition: The acquisition of cue-controlled cocaine seeking depends upon the BLA, the AcbC, and also the OFC. Early Performance: Performance of cue-controlled cocaine seeking depends upon the VTA and the interaction between the BLA and the AcbC. Finally, the AcbS mediates the dopamine-dependent potentiating effects of cocaine over cue-controlled cocaine seeking. Habitual Performance: When cue-controlled cocaine seeking becomes well established, or habitual, contingent presentations of CSs increase extracellular dopamine concentration in the dorsolateral striatum (DLS) but not in the AcbC or AcbS. Moreover, bilateral dopamine receptor blockade in the DLS selectively reduces cocaine seeking habits in rats. Thus, between the acquisition and the subsequent performance, or maintenance, of cue-controlled cocaine seeking there is an apparent shift in the locus of control from the Acb to the DLS, which, we have hypothesized, reflects the development of habitual drug seeking.
Therefore, between the acquisition and the subsequent performance or maintenance of cue-controlled cocaine seeking there is an apparent shift in the locus of control from the nucleus accumbens to the dorsolateral striatum, which, we have hypothesized, reflects the development of habitual drug seeking (56) (Fig. 6). We have established that this progressive ventral to dorsal striatum shift depends upon the intra-striatal and serial dopamine-dependent connectivity, linking the AcbC to the DLS both in nonhuman primates (98) and in rats (99), that has been proposed to be an anatomical substrate for integrative mechanisms linking incentive motivation to cognitive processes (98, 100). We have recently demonstrated that disconnecting the AcbC and its regulation of dopamine transmission in the DLS impairs habitual cue-controlled cocaine seeking to the same extent as bilateral dopamine receptor blockade in the DLS alone (58) (Fig. 7). This asymmetric manipulation does not impair general operant responding when instrumental performance for either a natural reward or cocaine is still under A–O control (Belin D., Besson M. and Everitt B.J., unpublished observations). On this evidence, we speculate that after extended training under the second-order schedule of cocaine reinforcement, cocaine seeking becomes established as an incentive habit whereby the Pavlovian incentive influences exerted by the BLA over the AcbC, in turn, enhance the powerful dorsolateral striatal dopamine-dependent habit system (Fig. 8).
Fig. 7.
Probing neural substrates of habitual cue-controlled cocaine seeking behavior: involvement of the DLS and its serial connection with the AcbC (After (51). With kind permission). (a) Schematic of the striatum in the rat (Modified from (58). With kind permission). The striato-nigro-striatal dopamine-dependent ascending circuitry is illustrated as the alternation of grey and black arrows from the ventral to the more dorsal parts of the circuit, that is, from the AcbS to the AcbC via the ventral tegmental area and from the AcbC, via the substantia nigra to the dorsal striatum. (b) Cocaine seeking is dose-dependently impaired by bilateral infusions of the DA receptor antagonist α-flupenthixol (depicted as dots) into the DL striatum. α-flupenthixol infusions into the DL striatum dose-dependently decreased responding on the active lever under a second-order schedule of cocaine reinforcement, but had no effect on responding on the inactive lever (58). (c) Disconnecting the AcbC from the dopaminergic innervation of the dorsal striatum impaired habitual cocaine seeking. In unilateral AcbC-lesioned rats, the AcbC relay of the loop is lost on one side of the brain. However, on the non-lesioned side, the spiraling circuitry is intact and functional. When α-flupenthixol is infused in the DLS contralateral to the lesion, it blocks the DAergic innervation from the midbrain impairing the output structure of the spiraling circuitry on the non-lesioned side of the brain. Therefore, this asymmetric manipulation disconnects the AcbC from the DLS bilaterally and greatly diminishes cocaine seeking (Adapted from (58). With kind permission).
Fig. 8.
Drug addiction conceptualized as a loss of executive top-down inhibitory control over an incentive habit. Exposure to addictive drugs triggers neurobiological, and hence, functional modifications, in neural networks involved in implicit subcortical, and declarative cortical, mechanisms. At the subcortical level, addictive drugs alter Pavlovian and instrumental learning mechanisms: they enhance the Pavlovian incentive influences from the BLA on the AcbC and alter the Pavlovian incentive processing between the BLA and the OFC thereby leading to increased incentive salience of drugs and environmental stimuli associated with them. Moreover, addictive drugs facilitate the instantiation of habitual responding, whereby drug seeking behavior is no longer under the direct control of the motivational properties of the drug itself, but instead is governed by stimuli in the environment. The development of habitual drug seeking and drug taking behavior may be related to a ventral to dorsal striatal shift in the locus of control over behavior dependent upon ascending, dopamine-dependent circuitry linking the ventral to the dorsal striatum via recurrent connections with the dopaminergic neurons of the ventral midbrain. Thus, maladaptive Pavlovian incentive processes that control “drug-oriented incentive impulses” in the AcbC are eventually channeled to the dorsal striatum-dependent habit system, thereby resulting in the emergence of incentive habits, which facilitate repetitive inflexible drug seeking and drug taking behavior. Nevertheless, incentive habits cannot account for the development of compulsive drug taking behavior, which, instead, may arise from the interaction between implicit subcortical mechanisms that tend to drive the addict toward drugs and drug-associated stimuli and declarative cortical mechanisms. Indeed, exposure to addictive drugs alters prefrontal cortical function, whereby top-down executive control over behavior is impaired. 
Drug addicts and drug-exposed animals display cognitive inflexibility, impaired decision-making processes and high rates of impulsivity, suggesting impairment of prefrontal cortical function. Thus, once incentive habits develop and interact with impaired prefrontal executive function, drug use becomes compulsive.
Although incentive habits play an important role in the pathophysiology of drug addiction, they do not account for the different behavioral aspects of the pathology, and especially compulsive drug use, that is, maintained drug use despite adverse consequences, which is a hallmark of drug addiction (see Table 1). Only recently have preclinical models of compulsive drug self-administration been developed, based on the premise that compulsive drug seeking or taking can be operationalized as persistent instrumental responding despite aversive consequences such as punishment and that it only emerges after extended access to the drug.
5 Compulsive Drug Self-Administration: When Punishment Fails to Prevent Drug Seeking and Taking
5.1 Animal Models of Compulsive Drug Seeking and Drug Taking
As emphasized previously, addicted individuals not only consume large amounts of drugs but are also unable to repress their drug use regardless of its consequences. Thus, addiction shares common features with other compulsive disorders that are characterized by the uncontrollable and irresistible urge to perform an act, often to relieve anxiety or stress, but regardless of the rationality of the motivation. The compulsive aspect of drug use in addicted subjects is even more obvious when similarities between addiction and obsessive compulsive disorder (OCD) are considered. Indeed, the 4th version of the DSM (31) defines compulsive behavior, as a criterion for OCD, as repetitive behaviors or mental acts that the person feels driven to perform in response to an obsession, or according to rules that must be applied rigidly, and that are aimed at preventing or reducing distress or some dreaded event or situation, but are either not connected to the issue or are excessive. Similarities between addiction and OCD have led, based on a modified version of the Yale–Brown Obsessive Compulsive Scale (Y-BOCS-hd; (101)), to the development of the Obsessive Compulsive Drinking Scale (OCDS), a self-rated questionnaire that is able to discriminate accurately between alcoholic outpatients and social drinkers with high sensitivity and specificity (102), suggesting that obsessionality and compulsivity are key features of the heavily addicted individual (102).
The inability to inhibit prepotent responses observed in compulsive disorders is commonly associated with perseverative responding regardless of negative feedback. Everitt and Robbins (2005) have suggested that this reflects a state of “must do!,” that is, specific behavioral responses must be repeated – although this subjective response could arise post hoc as a rationalization of “out-of-control” habitual behavior rather than being the driving influence (55).
Signal attenuation is a theory-driven model of obsessive compulsive disorder in which perseverative responding is induced by simulating a deficit in feedback sensitivity (for review, see (103)). Subjects trained to lever press for food are additionally given an external stimulus as feedback for the response. On the test day, the deficiency in response feedback is simulated by extinguishing the contingency between the response and the stimulus. Similarly, numerous preclinical models of addiction and relapse are based on extinction procedures.
In addition to the frequent assessment of performance under extinction as an index of motivation, reinstatement procedures have been widely employed to study factors involved in relapse to drug seeking behavior (104). In this model, extinguished drug-reinforced behavior normally resumes after noncontingent priming injections of the drug, re-exposure to drug-paired cues, or exposure to stressors. Concordance between the events that induce reinstatement in laboratory animals and those that provoke relapse in humans confers predictive validity to this model.
Extinction-based procedures in rodents have been proposed to mimic drug cessation, or abstinence, consequent on the lack of drug availability. However, it is far from clear that instrumental extinction has either face or ecological validity as a model of abstinence, since addicts never, or rarely, undergo extinction of their instrumental drug taking responses, such as i.v. drug preparation and injection. Drug addicts might be confronted for different reasons by the temporary restriction of drug availability, but they commonly resume drug use as soon as drugs become available again. On the other hand, extinction/reinstatement models do have direct relevance for behaviorally based treatments if they focus instead on eliminating the conditioned effects of drug-related stimuli by presenting them in the absence of the drug (105). Cue exposure therapies, initially developed to treat phobic neurosis, have been applied to the treatment of drug addiction on the understanding that disrupting the relationships between the drug and environmental stimuli associated with it may have beneficial effects. However, unlike phobic neurosis (106, 107), extinction treatment trials have not yet proven to be effective to treat heroin and nicotine dependence (108, 109).
Moreover in rats, a newly acquired instrumental response supported only by the conditioned reinforcing properties of stimuli previously paired with either cocaine, heroin, or sucrose can persist in the complete absence of the primary reinforcer over months of repeated, intermittent testing (110). Thus, it is possible that conditioned reinforcers, through their acquired reinforcing properties independent of the mental representation of the primary reinforcer, may control habitual instrumental responses, which are resistant to extinction. In addition, a deficit in extinction of Pavlovian associations produced by repeated drug exposure (111) may cause relatively long-lasting impairments in the control of behavior and thus facilitate the compulsive features of conditioned stimulus-maintained drug seeking. In summary, although persistent responding under extinction may provide a model of compulsivity, we suggest that it is perhaps more relevant to assess compulsivity as the altered responsiveness to adverse, instead of omitted, reinforcement. Moreover, it remains unclear whether the omission of a reinforcer, as in extinction procedures, and presentation of an aversive stimulus, as in the punishment procedure, are equivalent in terms of the psychological states they engender. Hence, it should not be assumed that there are commonalities in the underlying mechanisms when persistent responding is established using these two methodologies.
Limitations of extinction procedures in the context of drug addiction have led to the development of new animal models that reflect more ecologically valid influences in drug addicts. Clinical data on abstinence from cocaine use suggest that the negative consequences directly related to use are a major reason for cessation (112). Indeed, drug use is a high-risk behavior as it often compromises health, work, and social relationships (113–116). Preclinical models of drug addiction might therefore attempt to resemble in several respects the human conditions of compulsivity and fulfill some important features of the pathology in order to meet the necessary requirements of construct, face, and predictive validities essential for the clinical application of data obtained from animal studies (117, 118). Of course, in animals it is extremely difficult to exactly reproduce compulsive drug seeking and taking as seen in human drug addicts because of obvious limitations including the absence of direct personal costs such as family or society problems associated with drug abuse, or limited alternative reinforcement choices. However, despite such limitations, compulsivity in preclinical models of drug addiction should and must be defined as an inability to cease drug seeking and taking under conditions in which the drug is constantly available but its obtainment is associated with adverse consequences.
In recent years, progress has been made in an attempt to mimic human conditions of compulsive drug use. Since aversive consequences can originate from either the drug effect itself, the stimuli associated with drug use or the response for the drug, we will describe in detail how these features have been integrated in animal models of compulsive drug seeking and taking, referring to the potential advantages or disadvantages that each may present.
“Must do” despite adverse consequences
“Must do” despite devalued consequences