Showing posts sorted by relevance for query sepsis fail.

Sunday, March 24, 2013

Why Most Clinical Trials Fail: The Case of Eritoran and Immunomodulatory Therapies for Sepsis

The experimenter's view of the trees.
The ACCESS trial of eritoran in the March 20, 2013 issue of JAMA can serve as a springboard to consider why every biological and immunomodulatory therapy for sepsis has failed during the last 30 years.  Why, in spite of extensive efforts spanning several decades, have we failed to find a therapy that favorably influences the course of sepsis?  More generally, why do most clinical trials, when free from bias, fail to show benefit of the therapies tested?

For a therapeutic agent to improve outcomes in a given disease, say sepsis, a fundamental and paramount precondition must be met:  the agent/therapy must interfere with part of the causal pathway to the outcome of interest.  Even if this precondition is met, the agent may not influence the outcome favorably for several reasons:
  • Causal pathway redundancy:  redundancy in causal pathways may mitigate the agent's effects on the downstream outcome of interest - blocking one intermediary fails because another pathway remains active
  • Causal factor redundancy:  the factor affected by the agent has both beneficial and untoward effects in different causal pathways - that is, the agent's toxic effects may outweigh/counteract its beneficial ones through different pathways
  • Time dependency of the causal pathway:  the agent interferes with a factor in the causal pathway that is time dependent and thus the timing of administration is crucial for expression of the agent's effects
  • Multiplicity of agent effects:  the agent has multiple effects on multiple pathways - e.g., HMG-CoA reductase inhibitors both lower LDL cholesterol and have anti-inflammatory effects.  In this case, the agent may influence the outcome favorably, but it's a trick of nature - it's doing so via a different mechanism than the one you think it is.

Saturday, October 12, 2013

Goldilocks Meets Walter White in the ICU: Finding the Temperature (for Sepsis and Meningitis) that's Just Right

In the Point/Counterpoint section of the October issue of Chest, two pairs of authors spar over whether fever should be controlled in sepsis by either pharmacological or external means.  Readers of this blog may recall this post wherein I critically appraised the Schortgen article on external cooling in septic shock that was in AJRCCM last year.  Apparently that article made a more favorable impression on some practitioners than it did on me, as the proponents of cooling in the Chest piece hang their hats on this article (and their ability to apply physiological principles to medical therapeutics).  (My gripes with the Schortgen study were many, including a primary endpoint that was of little value, cherry-picking the timing of the secondary mortality endpoint, and the lack of any biological precedent for manipulation of body temperature improving mortality in any disease.)

Reading the Point and Counterpoint piece (in addition to an online first article in JAMA describing a trial of induced hypothermia in severe bacterial meningitis - more on that later) allowed me to synthesize some ideas about the epistemology (and psychology) of medical evidence and its evaluation that I have been tossing about in my head for a while.  Both the proponent pair and the opponent pair of authors give some background physiological reasoning as to why fever may be, by turns, beneficial and detrimental in sepsis.  The difference, and I think this is typical, is that the proponents of fever reduction:  a.) seem much more smitten by their presumed understanding of the underlying physiology of sepsis and the febrile response; b.) focus more on minutiae of that physiology; c.) fail to temper their faith in application of physiological principles with the empirical data; and d.) grope for subtle signals in the empirical data that appear to rescue the sinking hypothesis.

Friday, February 10, 2017

The Normalization Fallacy: Why Much of “Critical Care” May Be Neither

Like many starry-eyed medical students, I was drawn to critical care because of the high stakes, its physiological underpinnings, and the apparent fact that you could take control of that physiology and make it serve your goals for the patient.  On my first MICU rotation in 1997, I was so swept away by critical care that I voluntarily stayed on through the Christmas holiday and signed up for another elective MICU rotation at the end of my 4th year.  On the last night of that first rotation, wistful about leaving, I sauntered through the unit a few times thinking how I would miss the smell of the MICU and the distinctive noise of the Puritan Bennett 7200s delivering their [too high] tidal volumes.  By then I could even tell you whether the patient’s peak pressures were high (they often were) by the sound the 7200 made after the exhalation valve released.  I was hooked, irretrievably. 

I still love thinking about physiology, especially in the context of critical illness, but I find that I have grown circumspect about its manipulation as I have reflected on the developments in our field over the past 20 years.  Most – if not all – of these “developments” show us that we were harming patients with a lot of the things we were doing.  Underlying many now-abandoned therapies was a presumption that our understanding of physiology was sufficient that we could manipulate it to beneficial ends.  This presumption hints at an underlying set of hypotheses that we have which guide our thinking in subtle but profound and pervasive ways.  Several years ago we coined the term the “normalization heuristic” (we should have called it the “normalization fallacy”) to describe our tendency to view abnormal laboratory values and physiological parameters as targets for normalization.  This approach is almost reflexive for many values and parameters but on closer reflection it is based on a pivotal assumption:  that the targets for normalization are causally related to bad outcomes rather than just associations or even adaptations.

Saturday, September 25, 2010

In the same vein: Intercessory Prayer for Heart Surgery and Neuromuscular Blockers for ARDS

Several years back, the American Heart Journal published a now widely referenced study of intercessory prayer to aid recovery of patients who had had open heart surgery (see: Am Heart J. 2006 Apr;151(4):934-42). This study was amusing for several reasons, not least because, in spite of being funded by a religious organization, the results were "negative," meaning that there was no apparent positive effect of prayer. Of course, the "true believers" called foul, claiming that the design was flawed, etc. (Another ironic twist of the study: patients who knew they were being prayed for actually fared worse than those who had received no prayers.)

The most remarkable thing about this study for me is that it was scientifically irresponsible to conduct it. Science (and biomedical research) must be guided by testing a defensible hypothesis, based on logic, historical and preliminary data, and, in the case of biomedical research, an understanding of the underlying pathophysiology of the disease process under study. Where there is no scientifically valid reason to believe that a therapy might work, no preliminary data - nothing - a hypothesis based on hope or faith has no defensible justification in biomedical research, and its study is arguably unethical.

Moreover, a clinical trial is in essence a diagnostic test of a hypothesis, and the posterior probability of a hypothesis (null or alternative) depends not only on the frequentist data produced by the trial, but also on a Bayesian analysis incorporating the prior probability that the alternative (or null) hypothesis is true (or false). That is, if I conducted a trial of orange juice (OJ) for the treatment of sepsis (another unethical design) and OJ appeared to reduce sepsis mortality by, say, 10% with P=0.03, you should be suspicious. With no biologically plausible reason to believe that OJ might be efficacious, the prior probability of Ha (that OJ is effective) is very low, and a P-value of 0.03 (or even 0.001) is unconvincing. That is, the less compelling the general idea supporting the hypothesis is, the more robust a P-value you should require to be convinced by the data from the trial.
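The arithmetic behind this "trial as diagnostic test" view can be sketched in a few lines. The prior, power, and alpha values below are illustrative assumptions, not figures from any actual trial:

```python
# Posterior probability that Ha is true after a "positive" trial,
# treating the trial itself as a diagnostic test for the hypothesis:
#   P(Ha | significant) = prior*power / (prior*power + (1 - prior)*alpha)

def posterior_ha(prior: float, power: float = 0.80, alpha: float = 0.05) -> float:
    """Bayes' rule applied to a statistically significant trial result."""
    true_pos = prior * power            # Ha true AND the trial is significant
    false_pos = (1 - prior) * alpha     # Ha false, yet significant (Type I error)
    return true_pos / (true_pos + false_pos)

# An implausible hypothesis (OJ for sepsis): assumed prior ~1%
print(posterior_ha(prior=0.01, power=0.80, alpha=0.03))  # ~0.21 -- more likely than not a Type I error
# A plausible hypothesis at equipoise: assumed prior ~50%
print(posterior_ha(prior=0.50, power=0.80, alpha=0.03))  # ~0.96
```

The same P-value of 0.03 leaves the implausible hypothesis probably false and the plausible one probably true, which is the whole point: the prior does real work.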

Thus, a trial wherein the alternative hypothesis tested has a negligible probability of being true is uninformative and therefore unethical to conduct. In a trial such as the intercessory prayer trial, there is NO resultant P-value which is sufficient to convince us that the therapy is effective - in effect, all statistically significant results represent Type I errors, and the trial is useless.
(I should take a moment here to state that, ideally, the probability of Ho and Ha should both be around 50%, or not far off, representing true equipoise about the scenario being studied. Based on our data in the Delta Inflation article (see: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2887200/ ), it appears that at least in critical care trials evaluating comparative mortality, the prior probability of Ha is on the order of 18%, and even that figure is probably inflated because many of the trials that comprise it represent Type I errors. In any case, it is useful to consider the prior probability of Ha before considering the data from a trial, because that prior is informative. [And in the case of trials for biologics for the treatment of sepsis {be it OJ or drotrecogin, or anti-TNF-alpha}, the prior probability that any of them is efficacious is almost negligibly low.])

Which segues me to Neuromuscular Blockers (NMBs) for ARDS (see: http://www.nejm.org/doi/full/10.1056/NEJMoa1005372 ) - while I have several problems with this article, my most grievous concern is that we have no (in my estimation) substantive reason to believe that NMBs will improve mortality in ARDS. They may improve oxygenation, but we long ago abandoned the notion that oxygenation is a valid surrogate end-point in the management of ARDS. Indeed, the widespread abandonment of use of NMBs in ARDS reflects consensus agreement among practitioners that NMBs are on balance harmful. (Note in Figure 1 that, in contrast to the contention of the authors in the introduction that NMBs remain widely used, only 4.3% of patients were excluded because of use of NMBs at baseline.)

In short, these data fail to convince me that I should be using NMBs in ARDS. But many readers will want to know "then why was the study positive?" And I think the answer is staring us right in the face. In addition to the possibility of a simple Type I error, and the fact that the analysis was done with a Cox regression, controlling for baseline imbalances (even ones such as PF ratio which were NOT prospectively defined as variables to control for in the analysis), the study was effectively unblinded/unmasked. It is simply not possible to mask the use of NMBs: the clinicians and RNs will quickly figure out who is and is not paralyzed - paralyzed patients will "ride the vent" while unparalyzed ones will "fight the vent". And differences in care may/will arise.

It is the simplest explanation, and I wager it's correct. I will welcome data from other trials if they become available (should it even be studied further?), but in the meantime I don't think we should be giving NMBs to patients with ARDS any more than we should be praying (or avoiding prayer) for the recovery of open-heart patients.

Friday, May 31, 2013

Over Easy? Trials of Prone Positioning in ARDS

Published May 20 in the NEJM to coincide with the ATS meeting is the (latest) Guerin et al. study of Prone Positioning in ARDS.  The editorialist was impressed.  He thinks that we should start proning patients similar to those in the study.  Indeed, the study results are impressive:  a 16.8% absolute reduction in mortality between the study groups with a corresponding P-value of less than 0.001.  But before we switch our tastes from sunny side up to over easy (or in some cases, over hard - referred to as the "turn of death" in ICU vernacular) we should consider some general principles as well as about a decade of other studies of prone positioning in ARDS.

First, a general principle:  regression to the mean.  Few, if any, therapies in critical care (or in medicine in general) confer a mortality benefit this large.  I refer the reader (again) to our study of delta inflation, which tabulated over 30 critical care trials in the top 5 medical journals over 10 years and showed that few critical care trials show mortality deltas (absolute mortality differences) greater than 10%.   Almost all those that do are later refuted.  Indeed it was our conclusion that searching for deltas greater than or equal to 10% is akin to a fool's errand, so unlikely is the probability of finding such a difference.  Jimmy T. Sylvester, my attending at JHH in late 2001, had already recognized this.  When the now-infamous sentinel trial of intensive insulin therapy (IIT) was published, we discussed it at our ICU pre-rounds lecture and he said something like "Either these data are faked, or this is revolutionary."  We now know that there was no revolution (although many ICUs continue to practice as if there had been one).  He could have just as easily said that this is an anomaly that will regress to the mean, that there is inherent bias in this study, or that "trials stopped early for benefit...."
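The delta-inflation phenomenon can be illustrated with a quick simulation (all parameters are assumed for illustration, not taken from the article): when the true absolute mortality reduction is modest, the subset of trials that happen to report an "impressive" delta systematically overstates the true effect.

```python
import random

# Sketch of delta inflation / the winner's curse.  Assumed parameters:
# a true 5% absolute mortality reduction, 40% control-arm mortality,
# 300 patients per arm, 2000 simulated trials.
random.seed(0)
TRUE_DELTA = 0.05
BASELINE = 0.40
N_PER_ARM = 300
N_TRIALS = 2000

def simulate_trial() -> float:
    """Return the observed absolute mortality difference in one trial."""
    control = sum(random.random() < BASELINE for _ in range(N_PER_ARM))
    treated = sum(random.random() < BASELINE - TRUE_DELTA for _ in range(N_PER_ARM))
    return (control - treated) / N_PER_ARM

deltas = [simulate_trial() for _ in range(N_TRIALS)]
big = [d for d in deltas if d >= 0.10]  # trials reporting a >=10% delta

print(f"mean observed delta across all trials: {sum(deltas)/len(deltas):.3f}")
print(f"mean delta among 'impressive' trials:  {sum(big)/len(big):.3f}")
```

The average across all simulated trials recovers the true 5%, while the average among trials with a delta of 10% or more is roughly double the truth: sampling noise, not therapy, produced the headline number.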

Saturday, October 11, 2014

Enrolling Bad Patients After Good: Sunk Cost Bias and the Meta-Analytic Futility Stopping Rule

Four (relatively) large critical care randomized controlled trials were published early in the NEJM in the last week.  I was excited to blog on them, but then I realized they're all four old news, so there's nothing to blog about.  But alas, the fact that there is no news is the news.

In the last week, we "learned" that more transfusion is not helpful in septic shock, that EGDT (the ARISE trial) is not beneficial in sepsis, that simvastatin (HARP-2 trial) is not beneficial in ARDS, and that parenteral administration of nutrition is not superior to enteral administration in critical illness.  Any of that sound familiar?

I read the first two articles, then discovered the last two and I said to myself "I'm not reading these."  At first I felt bad about this decision, but then I realized it was a rational one.  Here's why.

Monday, December 31, 2007

Is there any place for the f/Vt (the Yang-Tobin index) in today's ICU?

Recently, Tobin and Jubran performed an eloquent re-analysis of the value of “weaning predictor tests” (Crit Care Med 2008; 36: 1). In an accompanying editorial, Dr. MacIntyre does an admirable job of disputing some of the authors’ contentions (Crit Care Med 2008; 36: 329). However, I suspect space limited his ability to defend the recommendations of the guidelines for weaning and discontinuation of ventilatory support.

Tobin and Jubran provide a whirlwind tour of the limitations of meta-analyses. These are important considerations when interpreting the reported results. However, lost in this critique of the presumed approach used by the McMaster group and the joint task force are the limitations of the studies on which the meta-analysis was based. Tobin and Jubran provide excellent points about systematic error limiting the internal validity of the study but, interestingly, do not apply such criticism to studies of f/Vt.

For the sake of simplicity, I will limit my discussion to the original report by Yang and Tobin (New Eng J Med 1991; 324: 1445). As a reminder, this was a single center study which included 36 subjects in a “training set” and 64 subjects in a “prospective-validation set.” Patients were selected if “clinically stable and whose primary physicians considered them ready to undergo a weaning trial.” The authors then looked at a variety of measures to determine predictors of those “able to sustain spontaneous breathing for ≥24 hours after extubation” versus those “in whom mechanical ventilation was reinstituted at the end of a weaning trial or who required reintubation within 24 hours.” While not explicitly stated, it looks as if all the patients who failed a weaning trial had mechanical ventilation reinstituted, rather than failing extubation.

In determining the internal validity of a diagnostic test, one important consideration is that all subjects have the “gold standard” test performed. In the case of “weaning predictor tests,” what is the condition we are trying to diagnose? I would argue that it is the presence of respiratory failure requiring continued ventilatory support. Alternatively, it is the absence of respiratory failure requiring continued ventilatory support. I would also argue that the gold standard test for this condition is the ability to sustain spontaneous breathing. Therefore, to determine the test performance of “weaning predictor tests,” all subjects should undergo a trial of spontaneous breathing regardless of the results of the predictor tests. Now, some may argue that the self-breathing trial (or spontaneous breathing trial) is, indeed, this gold standard. I would agree if SBTs were perfectly accurate in predicting removal of the endotracheal tube and spontaneous breathing without a ventilator in the room. This is, however, not the case. So, truly, what Yang and Tobin are assessing is the ability of these tests to predict the performance on a subsequent SBT.

Dr. MacIntyre argues that “since the outcome of an SBT is the outcome of interest, why waste time and effort trying to predict it?” I would agree with this within limits. Existing literature supports the use of very basic parameters (e.g., hemodynamic stability, low levels of FiO2 and PEEP, etc.) as screens for identifying patients for whom an SBT is appropriate. Uncertain is the value of daily SBTs in all patients, regardless of passing this screen or not. One might hypothesize that simplifying this step even further might provide incremental benefit. Yang and Tobin, however, must consider a failure on an SBT to have deleterious effects. They consider “weaning trials undertaken either prematurely or after an unnecessary delay…equally deleterious to a patient’s health.” There is no reference supporting this assertion. Recent data suggest that inclusion of “weaning predictor tests” does not save patients from harm due to avoiding SBTs destined to fail (Tanios et al. Crit Care Med, 2006; 34: 2530). On the contrary, inclusion of the f/Vt as the first of Tobin and Jubran’s “three diagnostic tests in sequence” resulted in prolonged weaning time.

Tobin and Jubran also note the importance of prior probabilities in determining the performance of a diagnostic test. In the original study, Yang and Tobin selected patients who “were considered ready to undergo a weaning trial” by their primary physicians. Other studies have reported that such clinician assessments are very unreliable with predictive values marginally better than a coin-flip (Stroetz et al, Am J Resp Crit Care Med, 1995; 152: 1034). Perhaps, the clinicians whose patients were in this study are better than this. However, we are not provided with strict clinical rules which define this candidacy for weaning but can probably presume that “readiness” is at least a 50% prior probability of success. Using Yang and Tobin’s sensitivity of 0.97 and specificity of 0.64 for f/Vt, we can generate a range of posterior probabilities of success on a weaning trial:

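The table that followed did not survive, but its arithmetic can be reproduced with Bayes' rule from the stated sensitivity (0.97) and specificity (0.64). The prior probabilities below are assumptions chosen for illustration, since the original table's rows are not recoverable:

```python
def posterior_success(prior: float, sens: float = 0.97, spec: float = 0.64):
    """Posterior probability of SBT success given the f/Vt result.

    Uses Yang and Tobin's reported sensitivity/specificity for f/Vt,
    treating a low (passing) f/Vt as a positive test for weaning success.
    """
    # Low f/Vt: positive test for success
    low = sens * prior / (sens * prior + (1 - spec) * (1 - prior))
    # High f/Vt: negative test for success
    high = (1 - sens) * prior / ((1 - sens) * prior + spec * (1 - prior))
    return low, high

# Assumed priors of readiness; "readiness" per the text is at least ~50%
for prior in (0.5, 0.7, 0.9):
    low, high = posterior_success(prior)
    print(f"prior {prior:.0%}: low f/Vt -> {low:.0%} success; high f/Vt -> {high:.0%}")
```

At a 50% prior, a failed (high) f/Vt leaves roughly a 4-5% posterior probability of SBT success, which is the "1 in 25" figure discussed below.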

As one can see, the results of the f/Vt assessment have a dramatic effect on the posterior probabilities of successful SBTs. However, is there a threshold below which one would advocate not performing an SBT if one’s prior probability is 50% or higher? I doubt it. Even with a pre-test probability of successful SBT of 50% and a failed f/Vt, 1 in 25 patients would actually do well on an SBT. I am not willing to forego an SBT with such data since, in my mind, SBTs are not as dangerous as continued, unneeded mechanical ventilation. I would consider low f/Vt values as completely non-informative since they do not instruct me at all regarding the success of extubation – the outcome in which I am most interested.

Other studies have used f/Vt to predict extubation failure (rather than SBT failure) and these are nicely outlined in a recent summary by Tobin and Jubran (Intensive Care Medicine 2006; 32: 2002). Even if we ignore different cut-points of f/Vt and provide the most optimistic specificities (96% for f/Vt <100, Uusaro et al, Crit Care Med 2000; 28: 2313) and sensitivities (79% for f/VT <88, Zeggwagh et al., Intens Care Med 1999; 25:1077), the f/Vt may not help much. As with the prior table, using prior probabilities and the results of the f/Vt testing, we can generate posterior probabilities of successful extubation:

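As with the first table, this one did not survive, but the posterior probabilities can be regenerated from the most optimistic published figures cited above (sensitivity 0.79, specificity 0.96), again treating a low f/Vt as a positive test, now for extubation success; the priors are assumed for illustration:

```python
def posterior_extubation(prior: float, sens: float = 0.79, spec: float = 0.96):
    """Posterior probability of successful extubation given the f/Vt result.

    sens/spec are the most optimistic published values cited in the text
    (Zeggwagh et al. and Uusaro et al., respectively); the priors passed
    in below are illustrative assumptions.
    """
    # Low f/Vt: positive test for extubation success
    low = sens * prior / (sens * prior + (1 - spec) * (1 - prior))
    # High f/Vt: negative test for extubation success
    high = (1 - sens) * prior / ((1 - sens) * prior + spec * (1 - prior))
    return low, high

for prior in (0.2, 0.5, 0.8):
    low, high = posterior_extubation(prior)
    print(f"prior {prior:.0%}: low f/Vt -> {low:.0%} success; high f/Vt -> {high:.0%}")
```

At a 50% prior, a low f/Vt pushes the posterior probability of successful extubation above 95%, the figure invoked below in the argument about skipping the SBT altogether.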

As with the predictions of SBT failure, a high f/Vt lowers the posterior probability of successful extubation greatly. However, one must consider the cutoff for posterior probabilities at which one would not even attempt an SBT. Even with a 1% posterior probability, 1 in 100 patients will be successfully extubated. This is the rate when the prior probability of successful extubation is only 20% AND the patient has a high f/Vt! What rate of failed extubation is acceptable or, even, preferable? Five percent? Ten percent? If one never reintubates a patient, it is more likely that he is waiting “too long” to extubate rather than possessing perfect discrimination. Furthermore, what is the likelihood that patients with poor performance on an f/Vt will do well on an SBT? I suspect this failure will prohibit extubation and the high f/Vt values will only spare the effort of performing the SBT. Is the incremental effort of performing SBTs on those who are destined to fail such that it requires more time than the added complexity of using the f/Vt to determine if a patient should receive an SBT at all? Presuming that we require an SBT prior to extubation, low f/Vt values remain non-informative. One could argue that with a posterior probability of >95%, we should simply extubate the patient, but I doubt many would take this approach, except in those intubated for reasons not related to respiratory problems (e.g., mechanical ventilation for surgery or drug overdose).

Drs. Tobin, Jubran and Marini (who writes an additional, accompanying editorial, Crit Care Med 2008; 36: 328) are master clinicians and physiologists. When they are at the bedside, I do not doubt that their “clinical experience and firm grasp of pathophysiology” (as Dr. Marini mentions), can match or even exceed the performance of protocolized care. Indeed, expert clinicians at Johns Hopkins have demonstrated that protocolized care did not improve the performance of the clinical team (Krishnan et al., Am J Resp Crit Care Med 2004; 169: 673). I have heard Dr. Tobin argue that this indicates that protocols do not provide benefit for assessment of liberation (American Thoracic Society, 2007). I doubt that the authors would strictly agree with his interpretation of their data since several of the authors note in a separate publication that “the regularity of steps enforced by a protocol as executed by nurses or therapists trumps the rarefied individual decisions made sporadically by busy physicians” (Fessler and Brower, Crit Care Med 2005; 33: S224). What happens to the first patient who is admitted after Dr. Tobin leaves service? What if the physician assuming the care of his patients is more interested in sepsis than ventilatory physiology? What about the patient admitted to a small hospital in suburban Chicago rather than one of the Loyola hospitals? Protocols do not intend to set the ceiling on clinical decision-making and performance, but they can raise the floor.