In the present study, the MIs remained stable throughout the observation period, confirming that the course of illness also remained stable over time in this subgroup of bipolar patients receiving prophylactic lithium treatment. No association could be found between the MIs and the number of episodes before the start of lithium treatment or the latency between the onset of illness and the start of lithium treatment.
In addition to this main finding, our study differs in several respects from investigations that have demonstrated poor stability with long-term lithium treatment. Many of the previous prospective studies on lithium treatment had relatively short observation periods (i.e., less than 2 years). Only a few had longer observation periods, extending up to 5 years (Maj et al. 1989b; Maj et al. 1998; Maj 2003) or 7 years (Vestergaard and Schou 1988). In the present study, however, data were collected over a much longer period, covering up to 20 years.
Although other studies have assessed large cohorts over long observation periods, they have not focused on the long-term stability of prophylactic lithium treatment. For example, the mood disorders center in Sardinia, a Stanley Foundation Bipolar Network research center (Post et al. 2001; Suppes et al. 2001), evaluated a large cohort of lithium patients comparable in size to the IGSLI cohort. Tondo and coworkers presented comprehensive data on the long-term course of their lithium-treated patients within the Sardinian cohort (Tondo et al. 1998; Baldessarini et al. 2000). The results of an analysis over a mean treatment period of 6 years show a substantial improvement in the course of illness during long-term lithium treatment compared to the period before lithium treatment was initiated, although complete protection against affective episodes was uncommon. However, the issue of stability over time in patients on long-term lithium treatment was not addressed in this analysis (Tondo et al. 2001). Rybakowski and coworkers analyzed the efficacy of long-term lithium treatment, comparing the pre-index period with a post-index lithium treatment period of 10 years (Rybakowski et al. 2001). The study examined whether the effectiveness of lithium treatment in patients who initiated treatment in the 1980s was lower than that observed in patients who initiated treatment in the 1970s. Although patients in the 1970s group were maintained on higher serum lithium levels, no decrease in the effectiveness of treatment was observed in the 1980s group.
Several studies have evaluated long-term outcomes in patients who began with lithium treatment but continued with various treatments other than lithium in the naturalistic setting. A recent study by Licht and coworkers found an unsatisfactory outcome after 15 years (Licht et al. 2008); however, their follow-up was based on registry data that did not contain information on whether patients had continued to receive lithium. As a result, their findings do not allow inferences on the effectiveness of long-term lithium treatment.
Our study was not concerned with the efficacy or effectiveness of lithium, both of which have been demonstrated in a substantial body of literature. The use of the MI as our outcome measure did not allow us to compare the pre-index and post-index course of illness, because the MI requires prospective follow-up to obtain valid results. Retrospective data (e.g., from patient histories) are insufficient in this regard.
The present study also differs from other investigations regarding the indication for starting lithium prophylaxis. Most studies performed during the last decade included patients who met a broader definition of bipolar disorder, so that patients with an episodic pattern of illness are systematically underrepresented (Coryell 2009; Grof et al. 1995; Goodwin 1999; Gershon et al. 2009). The lithium clinics involved in this study, however, follow the Kraepelinian tradition of diagnosing bipolar disorder. As a result, it is conceivable that most of the patients in our sample were bipolar in the traditional and narrow sense of the term.
In addition, many of the newer studies use time to new episode, time to rehospitalization, or hazard ratios for relapse as the main outcome measure for long-term prophylactic effectiveness (Bowden et al. 2003; Tohen et al. 2005; Viguera et al. 2001; Geddes et al. 2010; Suppes et al. 2009). Although this type of analysis is well suited to relatively short trials that aim at proving a single drug's efficacy, it is inappropriate for long-term maintenance studies because it fails to discriminate between different types of response. Outcome criteria such as relapse or recurrence do not afford proper assessment of the course of illness in patients who show substantial clinical improvement but still experience episodes, and they thus fail to consider a patient-focused perspective, which is relevant for clinical practice (Murru et al. 2011). Given that bipolar disorder is characterized by wide variations in the length and severity of episodes, the MI is an outcome measure that allows different forms of response and clinical course to be distinguished from one another precisely. This can be seen in an investigation of lithium maintenance treatment over a maximum of 15 years in a small subsample of the population in the present study: although the MItotal remained stable throughout the study period, the analysis of the absolute number of recurrences failed to produce any conclusive results because of the general shift over the study period from outpatient to inpatient treatment (Berghöfer et al. 1996). A comparable outcome measure, ‘burden of illness’, which uses a life chart method combining severity and duration of episodes, has recently been presented by Backlund and coworkers in a long-term evaluation (Backlund et al. 2009).
In summary, the MI appears to be the most accurate approach to describing chronic illnesses and would therefore seem to be a much more appropriate tool than conventional survival analysis. Among survival methods, extended Cox regression models can provide a more accurate description of chronic illnesses because they consider multiple recurrences rather than only the time to first recurrence (Pfennig et al. 2010).
The results of the present analysis are in agreement with those of several studies from the same group of researchers that were, in part, derived from the same patient data. Berghöfer et al. used the MI to report on long-term response in a subgroup of bipolar patients over a maximum of 15 years (Berghöfer et al. 1996), as noted above, and in another study over a maximum of 20 years (Berghöfer and Müller-Oerlinghausen 2000). In both studies, which included a subset of subjects from the present investigation, the severity and duration of recurrences remained stable or even decreased over the observation period, albeit in small samples. Two recent reviews also support our finding that the effectiveness of lithium prophylaxis does not diminish over time (Burgess et al. 2001; Kleindienst et al. 1999).
There has been some controversy as to whether the length of time between illness onset and the start of prophylactic treatment (i.e., latency) may influence patients' response to long-term treatment (Franchini et al. 1999). For this reason, we included latency of prophylactic treatment in our analysis. However, like the present analysis, other recent studies have not shown any association between negative outcomes and latency (Baethge et al. 2003a; Baethge et al. 2003b; Baldessarini et al. 2003).
Our study has several methodological limitations. Firstly, the severity of episodes may have been rated differently at the various centers due to the use of different symptom thresholds for the initiation of treatment. This clearly has the potential to affect which symptoms were rated as degree 2. In addition, with multiple countries and cultures involved, treatment selection may have varied depending on factors such as the healthcare system, the regional facilities available, and individual patient preferences. As in any long-term investigation, patients who received up to 20 years of treatment were seen by a large number of therapists with varying degrees of training. However, the influence of the above-mentioned factors may have been mitigated by the similar tradition of diagnosis and treatment followed by all of the centers that participated in the present study. More specifically, the centers agreed on a common treatment concept that gives preference to lithium monotherapy whenever possible as a means to avoid adverse events and drug-induced cycling. In addition, there were no differences in the MI between the centers. As a result, any center-specific effect is likely to have been relatively small.
Secondly, the centers participating in the present study were specialized academic outpatient clinics that, for the most part, treated patients who required an above-average amount of care. As such, a selection bias must be assumed. It should be noted, however, that the use of additional medication in our sample was quite low. Of the 346 patients, 152 (44%) had a mean co-medication period of 22.4 out of 52 weeks (see Table 2), which indicates that patients with a severe course of illness were unlikely to have been overrepresented. The use of co-medication was higher in other long-term observations (e.g., 15). Because the present study is not an epidemiological investigation with a representative sample of bipolar patients, our results cannot be extrapolated to the general population of these patients; similarly, it is not possible to fully apply our results to routine psychiatric practice.
Thirdly, this analysis did not count affective symptoms that had been rated as degree 1 (i.e., symptoms that do not require additional treatment). Recently, a substantial number of studies have been conducted to assess interepisodic subthreshold symptoms, such as cognitive or affective impairment. It seems unlikely, however, that including degree 1 symptoms in the analysis would significantly affect the long-term stability shown by the MI.
Fourthly, a substantial number of patients dropped out of the study before completing 20 years of treatment, and these subjects were not followed up. One might argue that analyzing only those patients who remained on lithium treatment caused a selection toward higher stability, because non-responders may have switched to a different long-term medication or treatment setting. However, it should be noted that the mean MI in patients who dropped out was not higher during their last year of follow-up than it had been during the preceding years. This indicates that the course of illness in subjects who left the study was no worse than that in subjects who continued lithium treatment.
As a final consideration, it should be pointed out that the MI does not fully reflect the effects and benefits of lithium in individual patients. A patient might show a higher MI than another patient during lithium treatment but might nevertheless experience a substantially greater reduction in his or her affective morbidity after starting treatment. Moreover, the same MI can reflect quite different clinical courses. For example, a patient with an MI of 0.125 may have spent 15 days in the hospital (degree 3) and shown no other illness burden during a period of 1 year. Alternatively, the patient could have received approximately 23 days of treatment in addition to lithium for any affective symptoms without having spent any time in the hospital (degree 2). To show individual benefits, data comparing pre- and post-treatment MI would have been helpful. However, assessing the initial effectiveness of lithium treatment was not the primary focus of our analysis.
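The arithmetic behind the example of an MI of 0.125 can be checked with a short sketch. It assumes that the MI is the time-weighted mean severity over the observation period, i.e., the sum of (degree × days at that degree) divided by the number of days observed; this formula is not restated in this section and is an assumption here, as are the function name `morbidity_index` and the 365-day year.

```python
from typing import Iterable, Tuple


def morbidity_index(
    episodes: Iterable[Tuple[int, float]],
    observation_days: float = 365.0,
) -> float:
    """Assumed MI definition: sum(degree * days) / observation_days.

    `episodes` holds (severity degree, days spent at that degree) pairs.
    """
    if observation_days <= 0:
        raise ValueError("observation_days must be positive")
    return sum(degree * days for degree, days in episodes) / observation_days


# Scenario 1: 15 days in hospital (degree 3), no other illness burden.
inpatient_only = morbidity_index([(3, 15)])

# Scenario 2: about 23 days of additional treatment (degree 2), no hospital stay.
outpatient_only = morbidity_index([(2, 23)])

print(round(inpatient_only, 3), round(outpatient_only, 3))  # prints 0.123 0.126
```

Under this assumed definition, both scenarios yield an MI close to 0.125, which illustrates why a single MI value cannot distinguish a short hospitalization from a longer period of milder symptoms.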