Monday, May 20, 2013

It All Hinges on the Premises: Prophylactic Platelet Transfusion in Hematologic Malignancy

A quick update before I proceed with the current post:  The Institute of Medicine has met and they agree with me that sodium restriction is for the birds.  (Click here for a New York Times summary article.)  In other news, the oh-so-natural Omega-3 fatty acid panacea did not improve cardiovascular outcomes as reported in the NEJM on May 9th, 2013.

An article by the TOPPS investigators in the May 9th NEJM is a useful reminder not to believe everything we read, to always check our premises, and to recognize that some data depend so heavily on the perspective from which they're interpreted, or on the method or stipulations of analysis, that they can be used to support just about any viewpoint.

The authors sought to determine if a strategy of withholding prophylactic platelet transfusions for platelet counts below 10,000 in patients with hematologic malignancy was non-inferior to giving prophylactic platelet transfusions.  I like this idea, because I like "less is more" and I think the body is basically antifragile.  But non-inferior how?  And what do we mean by non-inferior in this trial?

In this trial the outcome is bleeding of Grade 2 or higher on the WHO grading system for bleeding events.  You can click here for the definition of Grade 2 bleeding in Supplementary Appendix 3.  Scroll to page 13.  What if only Grade 2 bleeding were improved in a trial of prophylactic platelet transfusion?  Is a reduction in bruising and vaginal bleeding justification for prophylactic platelets?  You be the judge, because it's a judgment call for sure.  (Grades 3 and 4 are more unequivocal.)  There's another problem:  Do you consider bruises greater than 2 cm to be on par with retinal hemorrhage with visual impairment?  I don't, but the WHO does, and they're both Grade 2 according to WHO.  So we have uncovered a potentially flawed premise: namely that the use of the WHO bleeding grading system is a good idea for the design of these trials.  This is the problem with a great many trials: their design is inherently flawed in a way that severely limits their interpretation and thus their utility.

If you're skeptical that prevention of a Grade 2 bleed justifies a platelet transfusion, your next stop is Table 2 in the article on the auspicious page 1777.  Here we see that there were 18 fewer Grade 2 bleeds in the prophylaxis group and 5 fewer Grade 3 and 4 bleeds in the secondary analyses section.  (I'll return to the primary analysis in a moment.)  If you go back to the Supplementary Appendix and scroll to Table S1 on page 17, you will see that there were 504 transfusions in the no-prophylaxis group and 894 in the prophylaxis group, for a difference of 390 transfusions.  The NNT (Number Needed to Transfuse :-) is therefore approximately 17 transfusions to prevent one bleeding event (390 extra transfusions divided by 23 fewer bleeds), and 18 out of 23 times the bleeding event is Grade 2.  (I'm taking some liberty here.  Technically the NNT of the trial as a whole is 1/0.084 or about 12.  [Or, 1/0.07=14 if you use raw unadjusted data.]  But since patients received multiple transfusions, I'm trying to home in on the effect of one transfusion.)
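For anyone who wants to check my back-of-the-envelope arithmetic, here is a minimal sketch in Python.  It uses only the counts quoted above; the variable names are mine, and the "transfusions per bleed prevented" figure is my own liberty, not a statistic the authors report.

```python
# Counts quoted in the post (Table 2 of the article and Table S1 of the
# Supplementary Appendix). Variable names are illustrative only.
transfusions_prophylaxis = 894
transfusions_no_prophylaxis = 504
fewer_grade2_bleeds = 18
fewer_grade3_4_bleeds = 5

# Extra transfusions given under the prophylaxis strategy.
extra_transfusions = transfusions_prophylaxis - transfusions_no_prophylaxis

# Total bleeding events avoided in the prophylaxis group.
bleeds_prevented = fewer_grade2_bleeds + fewer_grade3_4_bleeds

# "Number Needed to Transfuse": extra transfusions per bleed avoided.
transfusions_per_bleed = extra_transfusions / bleeds_prevented

# Share of the avoided bleeds that were only Grade 2.
grade2_share = fewer_grade2_bleeds / bleeds_prevented

# Conventional per-patient NNT = 1 / absolute risk difference.
nnt_adjusted = 1 / 0.084  # adjusted difference reported by the trial
nnt_raw = 1 / 0.07        # raw unadjusted difference

print(f"Extra transfusions: {extra_transfusions}")
print(f"Transfusions per bleed prevented: {transfusions_per_bleed:.1f}")
print(f"Share of prevented bleeds that were Grade 2: {grade2_share:.0%}")
print(f"Per-patient NNT: {nnt_adjusted:.1f} adjusted, {nnt_raw:.1f} raw")
```

The two NNT figures answer different questions: the per-patient NNT counts patients assigned to a strategy, while the per-transfusion figure counts bags of platelets.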

There's more.  Note that the methods section, under statistical analyses, reveals that a 90% confidence interval (CI) was chosen for the analyses.  A 90% CI is narrower than a 95% CI, making it more likely that you will have a statistically significant result.  That's the same as saying that your p-value for significance is 0.10, and moving on without any further ado.  Here, for reasons inscrutable, the editors (and the editorialist) let it pass without further mention.  There's more still.  The margin of non-inferiority (delta) was chosen to be 10% as is [sadly] the custom [well, we have 10 fingers!], and also was not justified, as is likewise [and sadly] customary.  But lo!  After the first interim analysis showed that the event rate was 48% rather than the hypothesized 20% (which means that the effect size would be smaller, requiring a larger sample size), instead of increasing the sample size, the authors elected to increase the delta margin to 15%!  This is premeditated delta inflation in the first degree!  Note also that the usual justification for using a 5% one-sided significance level (that is, a two-sided p-value of 0.10), namely that "we don't care if the observed difference goes in the direction opposite that which we hypothesized", is bankrupt here - the results of the trial did indeed go in the direction opposite that which was hypothesized.  A final note is that the trial was analyzed, according to a priori plans, as intention to treat.  That's great for a superiority trial, but it is against the CONSORT recommendation that non-inferiority trials be analyzed per-protocol, the more conservative method - never mind whether post hoc analyses showed that both analyses produced similar results (refer, yet again, to the Supplementary Appendix).
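To see why widening delta is such a convenient alternative to recruiting more patients, here is a rough sketch of the trade-off.  It uses a generic textbook normal-approximation formula for a non-inferiority comparison of two proportions, which is not necessarily the calculation the TOPPS authors used.  The 80% power is purely my assumption for illustration (the post does not state the trial's power), and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(event_rate, delta, alpha_one_sided=0.05, power=0.80):
    """Approximate patients per arm for a non-inferiority comparison of two
    proportions, assuming the true difference is zero.

    Generic normal-approximation formula; power=0.80 is an assumed value,
    not one taken from the trial.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha_one_sided)
    z_beta = NormalDist().inv_cdf(power)
    # Variance of the difference of two proportions with equal event rates.
    variance = 2 * event_rate * (1 - event_rate)
    return ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)


# Original plan: 20% event rate, 10% margin.
n_planned = n_per_group(0.20, 0.10)
# Observed 48% event rate, margin kept at 10%.
n_same_margin = n_per_group(0.48, 0.10)
# Observed 48% event rate, margin widened to 15%.
n_wider_margin = n_per_group(0.48, 0.15)

print(f"20% events, delta 10%: {n_planned} per group")
print(f"48% events, delta 10%: {n_same_margin} per group")
print(f"48% events, delta 15%: {n_wider_margin} per group")
```

Under these assumptions the higher event rate pushes the required sample size up, and widening the margin pulls it back down, because the sample size scales with 1/delta squared.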

When I try to replicate the analyses with STATA, I am befuddled, perhaps because of the authors' adjustment "for diagnosis and study treatment as minimization variables".  I do not have the actual raw data so I cannot perform the adjusted analyses.  But STATA tells me that, based on the data presented in Table 2 under primary end points, the difference between the groups is 7% and the 90% CI is 0.3% to 13.7%.  This is a statistically significant result in favor of prophylaxis with an alpha of 0.10, but of course the upper bound of the CI crosses the initial 10% pre-specified margin of non-inferiority (delta) so CONSORT would consider it "inconclusive" for that delta.  But the upper bound falls below the 15% delta that was modified after the interim analysis, which would make no-prophylaxis non-inferior with these assumptions.  The raw unadjusted 95% CI would be -1.0% to +14.96%, so the result is (just barely) non-inferior with a 15% delta and a 95% CI.  Note also that, for superiority, prophylaxis is superior with a 90% CI (the CI does not include zero), but not with a 95% CI (the CI does include zero).
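The way the verdict flips with the choice of CI and delta can be reproduced, roughly, from the numbers above alone.  The sketch below backs the standard error out of the unadjusted 90% CI and rebuilds the 95% CI from it, assuming a symmetric normal-approximation interval; that is my assumption, and the result differs slightly in the last decimal from what STATA gave me.  The helper names are hypothetical.

```python
from statistics import NormalDist

# Unadjusted figures quoted in the post: risk difference and its 90% CI.
diff = 0.07
ci90 = (0.003, 0.137)

z90 = NormalDist().inv_cdf(0.95)   # two-sided 90% interval
z95 = NormalDist().inv_cdf(0.975)  # two-sided 95% interval

# Back out the standard error, assuming a symmetric normal-approximation CI.
se = (ci90[1] - ci90[0]) / 2 / z90
ci95 = (diff - z95 * se, diff + z95 * se)


def no_prophylaxis_non_inferior(ci, delta):
    """Non-inferior if the whole CI lies below the margin delta."""
    return ci[1] < delta


def prophylaxis_superior(ci):
    """Superior if the whole CI lies above zero."""
    return ci[0] > 0


for label, ci in (("90%", ci90), ("95%", ci95)):
    print(f"{label} CI: {ci[0]:+.3f} to {ci[1]:+.3f}")
    for delta in (0.10, 0.15):
        verdict = no_prophylaxis_non_inferior(ci, delta)
        print(f"  non-inferior at delta {delta:.0%}: {verdict}")
    print(f"  prophylaxis superior: {prophylaxis_superior(ci)}")
```

Same data, four different headlines, depending on which pair of knobs you choose to turn.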

And what the heck does all this mean?  Perhaps a bullet point summary is in order to sort it all out in digested form:
  • The interpretation of the trial hinges critically on the premises of the study and the endpoint chosen:  does prevention of WHO Grade 2 (or greater) bleeding make sense to you?
  • If you accept the primary endpoint as valid and relevant to practice, does a margin of non-inferiority (delta) of 10% or 15% satisfy you?
  • Are you OK with 90% confidence intervals and adjusted analyses and intention-to-treat analyses in a non-inferiority trial?  
  • Are you willing to declare superiority under less stringent criteria than you would have declared non-inferiority?  (See this Letter to the Editor of JAMA regarding the recent CONSORT statement revision.)

In the end, this trial reminds me of my conclusion for the Alteplase study for submassive pulmonary embolism:  TPA now saves you from TPA later.  Do platelet transfusions now just save you from platelet transfusions later?  Perhaps a compromise is to sit tight with thrombocytopenia in hematologic malignancy, and transfuse when bleeding occurs.  Those with a propensity to bleed will declare themselves with sentinel WHO Grade 2 bleeding.  Do I have data to support this approach?  I don't.  Do you have data to support another?
