An INTERVIEW with Dr. Jonathan Samet
ESI Special Topics,
Citing URL - http://www.esi-topics.com/airpoll/interviews/JonathanSamet.html
In the interview below, Special Topics correspondent Gary Taubes
talks with Dr. Jonathan M. Samet about his highly cited work
in air pollution research. According to our recent analysis,
Dr. Samet’s work ranks at #4 among scientists publishing on
air pollution in the past decade, with 28 papers cited a total
of 1,255 times. In the ISI
Web product, Dr. Samet’s work includes 89 papers cited a
total of 3,070 times to date in the field of Clinical
Medicine. Dr. Samet is Professor and Chairman of the
Department of Epidemiology in the Bloomberg School of Public
Health at Johns Hopkins University in Baltimore, Maryland.
What was it that prompted you to start studying particulate air pollution?
Give us the context of the research that led to your highly cited 2000
article in the New England Journal of Medicine (Samet, JM, et
al., "Fine particulate air pollution and mortality in 20 US
cities, 1987-1994," 343:1742-9, 2000).
I’ve worked on air pollution for a long time, going back to the
days of my post-doctoral fellowship in the late 1970s, and I had
worked on many, many pollutants. I came to Johns Hopkins in 1994,
and one of the first projects I became involved in was reexamining
some time-series studies that were coming out back then. At the
time, there were a number of these new time-series studies, some of
them from the Harvard group of Joel Schwartz and Doug Dockery,
suggesting that there was a day-to-day effect of pollution on
mortality. There was a lot of skepticism about these studies; people
did not quite have confidence in these methods. So Scott Zeger and I
became involved through the Health Effects Institute. Basically, we
were looking at some of that evidence. We built some of the
databases to evaluate them, and redid some of the analyses. Scott is
a wonderful methodologist, and together we began to explore some of
the issues related to these time-series analyses.
What do you mean by a time-series analysis in this context?
The basic idea is to take the number of deaths—say, day to day—or
the number of hospital admissions or emergency room visits and look
to see if there’s a relationship between that time series and
variations in air pollution levels. You also take into account the
other time-varying factors, like temperature or seasonal disease
outbreaks like influenza. So you use mathematical models to look at
the relationship between air pollution and mortality. I had done
some time-series work way back in a paper I published around 1981.
In the 1990s there was an increasing sophistication of methods, new
analytical tools, and the development of computer capacity that made
things possible that we couldn’t do before.
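To make the description above concrete, here is a minimal sketch of one such model: a log-linear Poisson regression of daily death counts on a pollution measure and temperature. Everything in it is an assumption made for illustration (the synthetic series, the coefficient values, and the names `solve_linear` and `fit_poisson`). The models used in the actual studies are not spelled out in this interview and were more elaborate.

```python
import math

def solve_linear(a, b):
    """Solve the small system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_poisson(rows, counts, tol=1e-10, max_iter=100):
    """Fit log E[count] = beta . row by Newton-Raphson; column 0 of rows must be 1."""
    p = len(rows[0])
    # Start from the log of the mean count, with all other coefficients at zero.
    beta = [math.log(sum(counts) / len(counts))] + [0.0] * (p - 1)
    for _ in range(max_iter):
        mu = [math.exp(sum(b * x for b, x in zip(beta, row))) for row in rows]
        score = [sum(r[j] * (y - m) for r, y, m in zip(rows, counts, mu))
                 for j in range(p)]
        info = [[sum(r[j] * r[k] * m for r, m in zip(rows, mu))
                 for k in range(p)] for j in range(p)]
        step = solve_linear(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Assumed "true" coefficients: intercept, pollution, temperature (illustrative only).
TRUE_BETA = [math.log(50.0), 0.001, -0.01]

days = range(365)
# Synthetic daily deviations from the mean: pollution (ug/m3) and temperature (deg C).
pollution = [15.0 * math.sin(0.9 * d) for d in days]
temperature = [10.0 * math.cos(2 * math.pi * d / 365) for d in days]
rows = [[1.0, p, t] for p, t in zip(pollution, temperature)]

# Noise-free expected daily deaths, so the fit should recover TRUE_BETA.
# Real data would be integer counts with random variation around these values.
deaths = [math.exp(sum(b * x for b, x in zip(TRUE_BETA, row))) for row in rows]

beta = fit_poisson(rows, deaths)
pct_per_10 = 100 * (math.exp(10 * beta[1]) - 1)
print(f"estimated change in daily deaths per 10 ug/m3: {pct_per_10:.2f}%")
```

The pollution coefficient is the quantity of interest. The temperature term stands in for the other time-varying factors that have to be accounted for at the same time.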
What was the motivation for the 2000 study itself on particulate air
pollution and mortality?
The 2000 study grew out of the work Scott and I had been doing.
Up until then, people would use a particular lengthy series of data,
or perhaps the data for wherever they happened to live. Our
"big idea" was to use cities without any selection; take
the largest cities or all those with pollution data, and analyze
those data in exactly the same way within each city. Until that
point, one group had done some multi-city studies in Europe. Their
model was to go where the data and investigators were available. In
the U.S., we could essentially take every city with data available
and put it together. The power of the method was that we would be
using large bodies of data, and we could try to optimize the
signal-to-noise ratio. We could also look across the country and see
whether the effect of the pollution varies. For example, in the
Northeast, people are concerned with power plants. In California,
they’re concerned about vehicle traffic. There is some variation
across the country in what people breathe. We thought this method
would allow us to better understand this heterogeneity.
It seems like a natural way to explore the science. Why hadn’t this
approach been taken before?
There were a couple of things that made this possible. One was
the availability of software and hardware that allowed us to do
this. The kind of things we did in this New England Journal of
Medicine paper were simply not possible even 10 years earlier.
And my colleagues are just superb methodologists: Scott Zeger, as I’ve
mentioned, and Francesca Dominici. They developed the regression
models at the heart of this research. And so the idea was to analyze
the data in the same way in each city, look at the day-to-day
evidence of air pollution and mortality, and then join the evidence
across cities. Is it the same? Or is it different? And, if it’s
different, can we explain the evidence?
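As a rough illustration of joining evidence across cities, the sketch below combines per-city estimates using simple inverse-variance weights and computes Cochran's Q as a summary of how much the cities disagree. This is a swapped-in textbook technique, not the pooling model from the study, which is not described in this interview. The function name `pool_estimates` and all of the numbers are made up.

```python
import math

def pool_estimates(estimates, std_errors):
    """Combine per-city estimates with inverse-variance (fixed-effect) weights.

    Returns the pooled estimate, its standard error, and Cochran's Q,
    a summary of how much the city estimates differ from the pooled value.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    return pooled, pooled_se, q

# Made-up per-city effects (percent change in daily deaths per unit of
# pollution) and their standard errors; these are not results from the study.
city_effects = [0.4, 0.6, 0.5]
city_std_errors = [0.2, 0.2, 0.1]

pooled, pooled_se, q = pool_estimates(city_effects, city_std_errors)
print(f"pooled effect = {pooled:.3f} (SE {pooled_se:.3f}), Q = {q:.2f}")
```

Cities with more precise estimates get more weight. A Q value that is large relative to the number of cities minus one would suggest that the effect genuinely differs from place to place.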
What did you find?
That paper used data from the 20 largest cities in the U.S.,
covering roughly 54 million people. It described an effect of
particulate air pollution on mortality that was statistically
significant and, I think, of important public health magnitude.
And we showed that the effect persisted when we took
account of other pollutants that might have been correlated with
particles.
Were you looking at all-cause mortality? Or specific diseases?
We looked at all-cause mortality and at cardio-respiratory
mortality, which included things like chronic obstructive pulmonary
disease, which is what we used to call emphysema, and then
pneumonia, heart attack, and congestive heart failure.
Were you surprised at how influential the paper has turned out to be?
Not really. I think there are a couple of reasons the paper had
such an impact. One is the demonstration that it was possible to
carry out this kind of analysis. The second is that it showed that
there was an effect of particles on mortality that could not be
attributed to other pollutants. And I think, probably, another
important thing with this paper is that it reduced huge amounts of
data down to a couple of simple graphs and numbers, which is
something that is useful for policy apparatus and policy making. By
putting together data for such a large number of cities, it provided
a very powerful piece of evidence for decision-making. I think that
added to the significance of the paper.
If you were to play devil’s advocate for a moment, what would you say
were the weak points in your paper? What are the most likely ways that
the evidence might have misled you?
Part of the story, which you might not be aware of, is that we
actually had to correct the evidence because of a software issue. I
remember saying to my colleagues, "Well, we found what
everybody else has; maybe we’re all right, maybe we’re all
wrong." We used the same software everybody else did. This was
the standard statistical package. The code was written by a superb
statistician, but the way these models are fit is that there is an
iterative algorithm that narrows down whatever the fit method is,
and then says, okay, the data are fit well enough. This particular
model in the software had been set in a default mode a long time ago
to not iterate many times. And probably it needed to be iterated
more times than it did in our original paper. We identified this a
couple of years later, and when we reran our data, the main message
was the same, but the estimate of the overall effect dropped. And in
fact, what we had done and the way we used the software was what
everybody had done and did afterwards, as well, and a number of
people ended up redoing their analyses. That’s one thing you
always worry about, some unknown issue of methodology. We’re using
very sophisticated tools and there’s always the possibility of
some methodological glitch you don’t understand.
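The convergence problem described above can be illustrated with a toy example. This is a sketch under stated assumptions, not the statistical package's actual routine: two made-up, strongly correlated predictors stand in for pollution and weather, and a simple loop (the hypothetical `backfit` function) refits one coefficient at a time until the updates fall below a tolerance. With a loose tolerance the loop stops early, while the pollution coefficient is still too large.

```python
import math

def backfit(x1, x2, y, tol, max_iter=10_000):
    """Alternately refit each coefficient on the residual left by the other.

    Stops once both coefficients change by less than tol in one pass.
    Returns (b1, b2, number_of_passes).
    """
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    b1 = b2 = 0.0
    passes = 0
    for passes in range(1, max_iter + 1):
        new_b1 = sum(a * (c - b2 * b) for a, b, c in zip(x1, x2, y)) / s11
        new_b2 = sum(b * (c - new_b1 * a) for a, b, c in zip(x1, x2, y)) / s22
        change = max(abs(new_b1 - b1), abs(new_b2 - b2))
        b1, b2 = new_b1, new_b2
        if change < tol:
            break
    return b1, b2, passes

# Made-up series: "weather" tracks "pollution" closely, so the two are
# hard to tell apart and the alternating updates converge slowly.
n = 200
pollution = [math.sin(0.1 * i) for i in range(n)]
weather = [p + 0.15 * math.cos(0.37 * i) for i, p in enumerate(pollution)]

# Assumed true coefficients (illustrative): 0.5 for pollution, 2.0 for weather.
outcome = [0.5 * p + 2.0 * w for p, w in zip(pollution, weather)]

loose_b1, loose_b2, loose_passes = backfit(pollution, weather, outcome, tol=1e-3)
strict_b1, strict_b2, strict_passes = backfit(pollution, weather, outcome, tol=1e-12)

print(f"loose tolerance:  pollution coefficient {loose_b1:.4f} after {loose_passes} passes")
print(f"strict tolerance: pollution coefficient {strict_b1:.4f} after {strict_passes} passes")
```

In this toy setup the strict run recovers the assumed coefficient, while the loose run stops with a larger one. That is the same direction as the correction described above, where the estimate of the overall effect dropped once the model was iterated further.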
I think for those of us who model data, we’re always concerned
that there’s some aspect of how we model the data that will
mislead us, and this is particularly dangerous when you’re
estimating these kinds of effects that are not huge. We’re not
detecting the kind of effect you can see visually. Part of the
strength of our approach is that we take all these cities and
analyze all the data at once and apply our models systematically.
Another concern with the literature before our study is that people
might have been selective in their modeling. They might have tended
to report the models that gave the strongest and most often
statistically significant results. You also worry about publication
bias. We just had an interesting paper published in Epidemiology,
in which we essentially compared our multi-city approach, this time
for ozone, to what was in the literature, where people had taken
some of the same cities individually that we had used for unified
data. And we showed clearly that the published single-city estimates
tended to be much higher than those we had arrived at.
How has this research evolved in the last five years since you published the paper?
We’ve taken the next step and we’re doing several things. One
is that we joined with colleagues in Europe and Canada to put all
the evidence together from around the world. The other data set we’ve
turned to, which I think is extraordinarily powerful, is Medicare,
which is basically an ongoing cohort study of 40 million people, age
65 and over. What we’ve done now is taken this Medicare data,
which includes death and hospitalization, and we’re joining that
with the air pollution data. We’re also looking at the effect of
smaller particles that are being measured by the EPA. We have an
idea that what we’re doing should be set up almost in a
surveillance fashion. We have the Medicare data that’s ongoing;
the air pollution data is ongoing. We’re trying to show that these
can be used together.
The other thing we’ve done, credit for which goes to Scott and
others, is to make our data and methods available. We have a web
site where we can post those data, along with the code. The idea is
that our findings should be robust to reanalysis.
In an ideal world, how would you further test the hypothesis that
particulate air pollution has adverse effects on mortality?
Just to dwell on the epidemiology approaches: there are two ways
to pursue it. One is to do more observational studies—more time
series, for instance, like we’ve been doing—and the other line
of approach, which may help give evidence for causal inference, is to
take advantage of circumstances in which there are sharp changes in
exposure to air pollution. People are now studying several of these
types of situations: the city of Dublin, for example, instituted a
coal-burning ban and abruptly changed the nature of air pollution.
Hong Kong removed sulfur from automobile fuels. So in these cases,
exposure changes in time and decouples itself from changes in
confounders. It gives us an opportunity to perhaps be stronger in
inferring cause if the data supports the hypothesis.
Dr. Jonathan M. Samet
School of Hygiene and Public Health
Johns Hopkins University
Baltimore, MD, USA