
Cognitive bias and the “number needed to treat”

October 12th, 2011

I posted a couple of days ago about some famous research showing that the “hot hand” in basketball is (apparently) an illusion. As one of the researchers, Amos Tversky, said:

I’ve been in a thousand arguments over this topic, won them all, but convinced no one.

That story interested me because I’d been thinking about similar issues in relation to the recent news that a U.S. medical panel has recommended against routine prostate cancer screening for healthy men. A couple of recent articles in the New York Times have explored the rationale behind this recommendation, and why our minds are ill-equipped to weigh the pros and cons in these cases. As Daniel Levitin wrote in a review of Jerome Groopman and Pamela Hartzband’s new book “Your Medical Mind”:

Yet studies by cognitive psychologists have shown that our brains are not configured to think statistically, whether the question is how to find the best price on paper towels or whether to have back surgery. In one famous study, Amos Tversky [yes, the guy who did the "hot hand" study -AH] and Daniel Kahneman found that even doctors and statisticians made an astonishing number of inference errors in mock cases; if those cases had been real, many people would have died needlessly. The problem is that our brains overestimate the generalizability of anecdotes… The power of modern scientific method comes from random assignment of treatment conditions; some proportion of people will get better by doing nothing, and without a controlled experiment it is impossible to tell whether that homeopathic thistle tea that helped Aunt Marge is really doing anything.

I actually got into a bit of a debate at (Canadian!) Thanksgiving dinner a few nights ago about prostate screening. Two of my elder relatives have had prostate cancer, and undergone the whole shebang: surgery, radiation, etc. One of them, in particular, was highly critical of the recommendation not to be screened. It had saved his life, he said. How did he know, I asked. He just knew, and he knew dozens of other men in his survivors’ support group who had also been saved.

I feel bad about having argued over what is clearly a very emotional topic. Nonetheless, I’m ashamed to admit, I did try to explain the concept of “number needed to treat.” Surely everyone would agree that, say, taking a million men and amputating their legs wouldn’t be worthwhile if it saved one man from dying of foot cancer, right? And if you accept that, then you realize that it’s not a debate about the absolute merit of saving lives; it’s a debate about weighing the relative impact and likelihood of different outcomes. That’s captured very nicely in another NYT piece, by a professor of medicine at Dartmouth, Gilbert Welch, who compared breast cancer screening (recommended) with prostate cancer screening (not recommended):

Overall, in breast cancer screening, for every big winner whose life is saved, there are about 5 to 15 losers who are overdiagnosed [i.e. undergo treatments such as surgery, radiation, etc.]. In prostate cancer screening, for every big winner there are about 30 to 100 losers.

So what’s the message here? Is it worth putting 100 men through surgery to extend one man’s life? How about 30 men? (The average extension of life after prostate surgery is six weeks.) Of course, there’s no “right” answer. As Welch writes, reasonable people can reach different conclusions based on the same input data. In the end, you have to assess the odds and the stakes, and choose how you want to gamble. But before you do, you should at least understand what those odds are, and that means taking a good look at number needed to treat and other ways of assessing treatments.
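
For readers who like to see the arithmetic, here is a minimal Python sketch of how number needed to treat falls out of two risk figures. The function name `risk_summary` and its parameters are hypothetical, and the inputs (5 deaths per 1,000 without the intervention, 4 per 1,000 with it) are purely illustrative, not figures taken from the articles quoted above.

```python
from fractions import Fraction


def risk_summary(control_risk: Fraction, treated_risk: Fraction) -> dict:
    """Summarize how much an intervention lowers the risk of a bad outcome.

    control_risk: probability of the outcome without the intervention.
    treated_risk: probability of the outcome with the intervention.
    """
    # Absolute risk reduction: the plain difference between the two risks.
    arr = control_risk - treated_risk
    if arr <= 0:
        raise ValueError("the intervention must lower the risk for NNT to be defined")
    return {
        "absolute_risk_reduction": arr,
        # Relative risk reduction: the same difference, as a share of baseline risk.
        "relative_risk_reduction": arr / control_risk,
        # Number needed to treat: people treated per bad outcome avoided.
        "number_needed_to_treat": 1 / arr,
    }


# Illustrative (assumed) inputs: 5 per 1,000 untreated vs. 4 per 1,000 treated.
summary = risk_summary(Fraction(5, 1000), Fraction(4, 1000))
print(f"relative risk reduction: {float(summary['relative_risk_reduction']):.0%}")  # 20%
print(f"absolute risk reduction: {float(summary['absolute_risk_reduction']):.1%}")  # 0.1%
print(f"number needed to treat:  {summary['number_needed_to_treat']}")              # 1000
```

With these inputs, a "20% reduction in risk" and "treat 1,000 people to help one" describe exactly the same result, which is why it pays to ask which framing you are being shown.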

  1. October 13th, 2011 at 15:48 | #1

    Great post, an important topic misunderstood by many

  2. Gerald Zavorsky
    October 13th, 2011 at 16:05 | #2

    Alex, concerning your post about breast cancer screening, which Gilbert Welch promotes (I cannot believe he actually promotes it), I would like to offer my two cents. Actually, breast cancer screening is debatable. I would suggest reading an article titled “Helping Doctors and Patients Make Sense of Health Statistics” (Psychological Science, 8(20) 53-96, 2008, freely accessible) in which breast cancer screening is refuted. In women over age 50 the breast cancer mortality is 0.5%, or 5 women out of 1000, if there is NO yearly breast cancer screening (Maturitas 67,5-6, 2010). If yearly screening is performed, then the relative risk reduction is 20%. Sounds substantial. But it is not. The ABSOLUTE reduction in risk is what women should be concerned about with breast cancer screening, which is 1 per 1000 women, or 0.1% (Maturitas 67,5-6, 2010). Absolute risks foster greater insight than relative risks do. Thus the data suggests that breast cancer screening is not beneficial to women. Contrast that with the harms of taking a breast biopsy. About 50 to 200 women without breast cancer will undergo unnecessary biopsies and anxieties due to false positives; for these women, screening is of no benefit and negatively affects quality of life (Maturitas 67,5-6, 2010). Thus, at a conservative estimate, only 54 out of 1000 women (5.4%) who have a biopsy test will actually die from breast cancer. This is called the positive predictive value or precision rate. To get the precision rate you add 4/1000 (mortality) and 50/1000 (false positives) to get 54/1000 (5.4%). 5.4% is a small precision rate for detecting breast cancer mortality from biopsies. Once women and physicians understand how to interpret health statistics, using breast cancer screening as an example, a better informed decision can be made.

  3. Phil Koop
    October 13th, 2011 at 19:28 | #3

    I don’t understand Gerald’s arithmetic. 54/1000 (5.4%) is the number of women who will have biopsies if screening is implemented, not the number of those biopsied who will die of cancer. (Presumably there should be a “per year” attached to all these numbers?) The positive predictive value is 4/54, or about 7.4%.
