Introduction

Respiratory diseases are common throughout the world, affecting a large fraction of the population every year1,2,3. These diseases affect all age groups, although children are more vulnerable; out of nearly 13 million annual deaths of children below the age of five years in the developing countries, a large fraction is due to acute respiratory diseases (ARD)2. Globally, ARD is responsible for about 20% of the deaths of children below the age of five3. In the United States of America, respiratory diseases kill more than 400,000 each year, making them the third leading cause of death4. However, ARD are often not given due importance in discussions and decisions about public health. One of the challenges in dealing with ARD is that these conditions can take many forms. In particular, ARD often also include acute respiratory infections (ARI). Some disorders, such as asthma, are common; others, such as the Hermansky-Pudlak syndrome, are rare4. In addition to the extreme situation (death), morbidity and clinical efforts can result in loss of time, wellness and increase in anxiety. While effective measures at home as well as public health care are needed for control of ARD, it is necessary to identify the major drivers and preferably precursors, for effective and pro-active mitigation.

Susceptibility to ARD depends on a number of local factors and habits; while adequate protection like clothing may not be an important factor at many locations, such parameters may play important roles in some cases. Studies based on respiratory disease in two eco-climatic zones in Nigeria (Humid-Forest and Derived-Savana) have revealed significant variations in the morbidity rates over the two locations5. Because many ARD are infectious, habitats with high population density are particularly vulnerable. In Delhi, ARD continue to be a major health hazard, with several thousand deaths, many among them children, reported every year2. A characteristic feature of ARD deaths in Delhi is their complex inter-annual variability, which is a major challenge in designing quantitative pro-active measures, like advance planning of medication.

A number of drivers, biological, chemical and meteorological, control the incidence and severity of ARD. A well recognized association of ARD is with air pollution, especially for noninfectious acute respiratory diseases or disorders such as asthma. Several studies have shown a close relationship between air pollution and mortality6,7,8. High concentrations of pollutants like suspended particulate matter (SPM), respirable suspended particulate matter (RSPM), sulphur dioxide (SO2) and nitrogen dioxide (NO2) are well recognized health hazards9,10,11. An increase in RSPM concentration can result in increased heart rate, decreased heart rate variability and increased cardiac arrhythmias12,13. PM10 and PM2.5 can also penetrate deep into the pulmonary interstitial spaces in the lungs, thus provoking inflammation due to their higher diffusion coefficient14. Recent studies in several countries show an association between infant mortality and RSPM15,16,17 and suggest that RSPM is associated with postneonatal mortality (deaths occurring after 28 days of life), with respiratory causes having the highest association15,16. Other pollutants like SO2 can also induce ARD symptoms, especially in children and people with chronic respiratory disorders and cardiopulmonary diseases7. Similarly, high levels of NO2 can damage the respiratory tract and can increase susceptibility to, and the severity of, respiratory disorders and asthma. Because NO2 is relatively insoluble in aqueous surfaces, the upper airways retain only small amounts of the inhaled gas. These pollutants can therefore individually and collectively cause ARD.

A relatively less explored driver of ARD, particularly in India, is weather conditions, especially temperature. The possible roles of meteorological variables in acute respiratory tract infection have been considered in some earlier studies18,19,20. Because of the infectious nature of many ARD, prevalence and severity can depend on social processes like crowding. In addition, the seasonality of ARD, particularly for specific infectious agents such as respiratory viruses, is well established; for instance, influenza peaks in winter as the virus survives better under cold and dry conditions18. Earlier studies on animals (guinea pigs) have also shown that ARD (spread of influenza virus) is a function of both ambient relative humidity and temperature6. However, unlike animals in a laboratory environment, spread of disease among humans is a function of complex socio-economic factors like exposure and crowding. A study focused more specifically on low temperature and dryness (low humidity) among a diverse sample (military recruits, asthmatic as well as non-asthmatic individuals) in Finland showed discernible associations between ARD and low temperature as well as low humidity20. In general, low temperatures are found to be associated with a wide spectrum of ARD; the physiological evidence for causality of the effect of temperature on mortality is greatest for cardiovascular, followed by respiratory, diseases21. It has been reported that a “cold” day has an effect that lasts up to two weeks, while the effect of a “hot” day is apparent for only a few days in the mortality series. Studies also show that mortality rates in winter are 10–25% higher than death rates in summer in many temperate countries, although the causes of this winter excess are not well understood22,23,24,25. It has been suggested that cold-related mortality in temperate countries is related in some part to the occurrence of seasonal respiratory infections25.
Many studies report the seasonal patterns of infectious diseases; however, the relative roles of the variables have not been clearly identified. At the same time, seasonal rainfall like the monsoon is likely to modulate the transmission of many infectious diseases. Population-based studies also provide evidence that environmental temperature affects mortality due to both cardiovascular and respiratory diseases. However, there is little recorded evidence of an association between weather conditions and measures of morbidity such as hospital admissions or primary care consultations26. A study of general practitioner consultations among the elderly in Greater London found that temperature affected the rate of consultation for respiratory diseases but not that for cardiovascular diseases27. However, it is not clear how these end-points relate to quantitative measures of health burden.

Although an association between low temperature and deaths due to ARD has been noted in earlier studies for some regions18,19,20,28,29, quantitative assessment in a region-specific manner is generally missing. Depending upon the threshold, the average number of cold days over Delhi can vary from 103 (Tc ≤ 20°C) to 145 (Tc ≤ 24°C) to 180 (Tc ≤ 27°C), with interannual variability (standard deviation) of around 7–9 days (see Supplementary Fig. S1). It is likely that inadequate protection against cold conditions is a major cause of ARD in Delhi, especially among the children of the poorer sections. It must be noted that the effect of low temperature will depend on the local climate; for a region like Finland, the average temperature considered may be about 10°C20; a much higher average (and hence threshold) has to be considered for some regions. Thus, an important question is the identification of the thresholds for defining cold days. While in some cold countries a temperature of 20°C may be quite normal, such a day could be considered a cold day in many tropical countries.

The primary objective of our work is to establish the relative roles of pollution and cold days in a quantitative framework. We first identify the relative roles of different pollutants and the number of cold days in ARD morbidity and death. Next, we present relations between ARD and parameters like atmospheric pollution and weather conditions (number of cold days), with a quantitative definition of the threshold for cold days. Finally, we explore the degree to which ARD deaths can be represented through the number of cold days alone and discuss the feasibility of applying techniques of weather forecasting and dynamical air pollution models for forecasting the number of ARD cases for pro-active mitigation.

We shall estimate the number of cold days based on daily average temperature. While minimum daily temperature may also be relevant, we have considered daily average temperature to allow for effects of exposure during the entire day. An obvious reason for considering weather variables (temperature) as a major driver of ARD is that ARD is prevalent not only in major metros with significant atmospheric pollution, but also in areas with low or essentially no air pollution, although the number of cases may be low. We have considered multi-source data and a quantitative relation for estimating ARD load (Equation (1), Methods Section). ARD is also characterized by strong seasonality, with a winter maximum and a summer minimum; this indicates the presence of thresholds in the dynamics of ARD. Conceptually also, the effect of temperature is expected to be threshold based; we have incorporated this concept through consideration of cold days. Thus one novel aspect of our formalism is to identify the cold days, rather than temperature itself, as a driver of ARD. This is distinct from using a meteorological variable, like temperature, directly in a statistical approach, like non-linear regression. In our formalism, a cold day (a human-centric and disease-centric measure), rather than actual temperature (a meteorological parameter), is the driver that leads to (increased) ARD (such as infection through exposure and crowding). It must be emphasized, however, that a ‘cold day’ is only a proxy driver for the actual causal factors for ARD, like transmission due to increased crowding. Acute respiratory disease can be caused by infectious as well as non-infectious agents; in what follows we shall refer to them as ARD in which temperature and air pollution may play direct or indirect roles.

Deaths resulting from diseases like ARD are complex functions of many factors, from timely detection to the access to health care to immunological history and socio-economic conditions. In particular, for economically heterogeneous societies, it is necessary to consider affordability of preventive measures, like warm clothing, in assessing causes of ARD. However, we may assume that the number of ARD deaths, in general, will be proportional to the number of ARD cases.

Results

Association between ARD and cold days

At the monthly scale, the number of cold days and the morbidity rates for 2004 are well correlated (Fig. 1), with a correlation coefficient significant at the 95% level for the degrees of freedom involved. It may be noted that the morbidity rates do not go to zero even during the months when the number of cold days is zero. The presence of non-zero morbidity even in the absence of cold days indicates likely roles of other mechanisms and drivers of ARD; while this residual morbidity may be due to many unaccounted factors, air pollution is likely to be a primary one. Besides, we have not considered infectious and non-infectious cases separately; availability of data for a separate analysis can help to resolve whether this residual morbidity is primarily related to non-meteorological factors.
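
A significance statement of this kind can be checked with a short calculation. The sketch below is illustrative only: it computes a Pearson correlation coefficient for two monthly series and compares it with the critical value implied by a Student t test with n − 2 degrees of freedom. The monthly values are invented placeholders, not the Gokulpuri data, and the use of a two-sided test at the 95% level is an assumption, since the test actually used is not stated here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def critical_r(t_crit, n):
    """Smallest |r| that is significant for a critical t value and sample size n."""
    dof = n - 2  # degrees of freedom for a correlation between n pairs
    return t_crit / math.sqrt(dof + t_crit ** 2)


# Placeholder monthly series (12 months); NOT the observed data.
cold_days = [31, 26, 8, 0, 0, 0, 0, 0, 0, 2, 14, 29]
morbidity = [32, 30, 21, 15, 14, 16, 17, 18, 16, 19, 25, 31]

# Two-sided 95% Student t critical value for 10 degrees of freedom (assumed test).
T_CRIT_95_DOF10 = 2.228

r = pearson_r(cold_days, morbidity)
threshold = critical_r(T_CRIT_95_DOF10, len(cold_days))
print(f"r = {r:.2f}, critical |r| = {threshold:.3f}, significant: {abs(r) > threshold}")
```

With twelve monthly values there are only ten degrees of freedom, so under these assumptions a correlation has to exceed roughly 0.58 in magnitude to be significant at the 95% level.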

Figure 1

Month-wise morbidity rates per 100 children in Gokulpuri for 20042 and the monthly number of cold days for different threshold temperature: Tc (≤20°C) in top panel, (≤24°C) in middle panel and (≤27°C) in bottom panel over Delhi as indicated.

The number of cold days (daily average temperature) is based on daily meteorological data from NCEP38 and IMD40 over Delhi. The first number in the bracket represents the correlation coefficient between morbidity rate and the number of cold days for the respective case; the second number represents the significance levels for corresponding correlation coefficients.

Our premise that the number of cold days is a major driver of ARD is further strengthened by the fact that the numbers of ARD cases as a percentage of population over several other states are much higher than those over Delhi (Fig. 2), even though these states are relatively less urban and less affected by air pollution. We next investigate the association between ARD and atmospheric pollutants over Delhi.

Figure 2

Annual number of ARD cases (as percentage of total population) and annual ARD deaths (as percentage of number of cases) for four states over India36,42.

The correlation coefficients between the number of cases and the number of deaths are given in the respective panels; the number within brackets represents the significance level for the corresponding (positive) correlation coefficient.

Association between ARD and atmospheric pollutants

Although the spectrum of air pollutants, like PM2.5, primarily responsible for ARD is relatively well known30, we have examined the relative roles of various pollutants in ARD for Delhi. Based on available data, we have considered the association between ARD morbidity rates and four pollutants for the twelve months in 2004. A comparison of the observed morbidity rates with the pollutant concentrations for SPM, RSPM, SO2 and NO2 shows (Fig. 3) a clear association with RSPM, as expected, but also with NO2; there is no significant correlation with SPM or SO2. In our subsequent discussion, therefore, we shall only consider RSPM and NO2.

Figure 3

Month-wise morbidity rates per 100 children in Gokulpuri2 and load of observed pollutants over Delhi for the year 2004 as indicated.

The numbers in each panel represent the correlation coefficients between number of deaths and pollutant concentrations. The first numbers in bracket in each panel represent the correlation coefficients between morbidity rate and pollution; the second number provides the significance level for respective (positive) correlation coefficients.

Quantitative estimation of ARD load

Based on the above results, we have considered a mathematical expression to represent the number of ARD deaths as a function of both the number of cold days and pollutant concentration (Methods section). The combined effect of air pollution and temperature on ARD is a complex function of time of exposure, level of pollution at the time of exposure and other factors like clothing. At the same time, neither air pollution nor the weather variable (temperature) alone explains either the inter-annual variability of ARD deaths (Fig. 4) or the annual cycle of morbidity (Fig. 3); thus, both air pollution and the number of cold days play important roles. As mentioned above, there are likely to be additional drivers; however, our focus is on a quantitative estimation of how much of the ARD load and variability can be explained by these two drivers alone.
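
As an illustration only (not the procedure actually used for the figures), the three kinds of hazardous day defined in the figure captions of this section, namely cold days (Tc ≤ 20°C), days at or above the permissible level of RSPM, and days at or above twice the permissible level of NO2, can each be counted from a daily series. In the sketch below the daily values and both permissible levels are hypothetical placeholders, and the function names are introduced here purely for illustration.

```python
def count_cold_days(daily_mean_temp_c, threshold_c=20.0):
    """Number of days whose daily average temperature is at or below the threshold."""
    return sum(1 for t in daily_mean_temp_c if t <= threshold_c)


def count_exceedance_days(daily_concentration, permissible_level, factor=1.0):
    """Number of days with concentration at or above factor * permissible_level."""
    limit = factor * permissible_level
    return sum(1 for c in daily_concentration if c >= limit)


# Placeholder daily values for a five-day period; NOT observed data.
temp_c = [14.2, 19.8, 20.0, 23.5, 31.0]
rspm = [180.0, 95.0, 100.0, 240.0, 60.0]
no2 = [85.0, 40.0, 160.0, 79.9, 30.0]

RSPM_LIMIT = 100.0  # hypothetical permissible level, same units as rspm
NO2_LIMIT = 80.0    # hypothetical permissible level, same units as no2

counts = {
    "cold_days": count_cold_days(temp_c, 20.0),
    "rspm_days": count_exceedance_days(rspm, RSPM_LIMIT),
    "no2_days": count_exceedance_days(no2, NO2_LIMIT, factor=2.0),
}
print(counts)
```

Keeping the thresholds as explicit arguments makes it straightforward to repeat the count for other cold-day thresholds or other regulatory limits.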

Figure 4

Observed number of deaths from ARD for 2000–2005 (bars) and the number of hazardous days (observed).

The number of cold days (Tc ≤ 20°C) is calculated based on daily temperature from NCEP38 daily reanalysis and IMD40 average over Delhi. The corresponding observed ARD deaths have been adopted from Government of NCT of Delhi35. Different thresholds for hazardous days are based on 1. cold days: ≤20°C, 2. days ≥ permissible level of RSPM, 3. days ≥ (2 × permissible level of NO2). The first number in bracket in each panel represents the correlation coefficients between number of deaths from observations and estimates; the second number provides significance level for the corresponding correlation coefficient.

The number of annual ARD deaths estimated using observed number of cold days, observed RSPM and observed NO2 compares well with the observed ARD deaths (Fig. 5, bottom right panel). The effect of absence of RSPM in the process is not significant (Fig. 5, top left panel) while the absence of temperature significantly reduces the forecast skill (Fig. 5, bottom left panel); absence of NO2 has no appreciable effect (Fig. 5, top right panel).

Figure 5

Number of deaths from ARD for the years 2000–2005 from observation (filled bar) and estimates with observed values of pollutant, the number of cold days (Tc ≤ 20°C) and the number of hazardous days.

Different thresholds for hazardous days are based on 1. cold days: ≤20°C, 2. days ≥ permissible level of RSPM, 3. days ≥ (2 × permissible level of NO2). The daily temperature data is from NCEP38 and IMD40 over Delhi. The corresponding observed ARD deaths have been adopted from Government of NCT of Delhi35. The first number in brackets in each panel represents the correlation coefficient between observed and estimated ARD deaths; the second number provides the significance level for the respective correlation coefficient.

To further quantify the relative roles of RSPM, NO2 and the number of cold days, we have examined best estimates with a single variable using equation (1); the best estimate is obtained by allowing the coefficient to vary to attain minimum average error (equation (2)). These estimates show RSPM and the number of cold days to have comparable effects (Fig. 6), while NO2 is found to have only a negligible role.
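
The idea of varying a single coefficient to minimize the average error can be sketched as a simple grid search. Equations (1) and (2) are not reproduced in this section, so the sketch assumes, purely for illustration, a proportional form (deaths = a × number of hazardous days) and a mean absolute relative error; the annual values and the range of trial coefficients are invented placeholders.

```python
def mean_abs_relative_error(observed, estimated):
    """Average of |estimate - observation| / observation over all years."""
    return sum(abs(e - o) / o for o, e in zip(observed, estimated)) / len(observed)


def fit_single_coefficient(driver, observed, candidates):
    """Grid search for the coefficient a in the assumed form deaths = a * driver."""
    best_a, best_err = None, float("inf")
    for a in candidates:
        estimated = [a * d for d in driver]
        err = mean_abs_relative_error(observed, estimated)
        if err < best_err:
            best_a, best_err = a, err
    return best_a, best_err


# Placeholder annual values for six years; NOT the Delhi record.
cold_days = [98, 105, 110, 95, 102, 108]
deaths = [4900, 5300, 5400, 4800, 5200, 5350]

candidates = [40 + 0.5 * k for k in range(41)]  # trial coefficients 40.0 ... 60.0
a, err = fit_single_coefficient(cold_days, deaths, candidates)
print(f"best coefficient = {a:.1f}, mean relative error = {100 * err:.1f}%")
```

Repeating the same search with RSPM days or NO2 days as the driver gives one error value per variable, which is the kind of single-variable comparison described above.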

Figure 6

Reported (filled bar) and estimated (hollow bar) number of deaths from ARD for the years 2000–2005, with estimated number of deaths based on (a) hazardous days only in terms of observed values of RSPM (≥permissible level of RSPM) (b) hazardous days only in terms of observed values of NO2 (≥2x permissible level of NO2) (c) only number of cold days (bottom panel).

The daily temperature data is from NCEP38 and IMD40 over Delhi. The corresponding reported ARD deaths have been adopted from Government of NCT of Delhi35. The value of the parameter that provides maximum skill for the corresponding cases is given in the header. The first numbers in bracket in each panel represent the correlation coefficients between number of deaths from observations and estimates; the second number represents the significance level for corresponding correlation coefficients.

In terms of the mean ARD, estimates with only RSPM as well as with all the three processes show comparable results; however, the standard deviation (as percentage of the respective mean) is underestimated in general (Table 1). The highest correlation between the observed and the estimated inter-annual variability is once again seen for all the three processes combined. However, estimates with only the number of cold days, with NCEP data, have an average relative error comparable to that with all the three processes; it is important to note that this error does not decrease appreciably when the pollutants are included.

Table 1 Mean, standard deviation, correlation coefficients and average absolute error of observed and estimated ARD deaths (2000–2005) from observed inputs

We have considered only the period 2000–2005 to examine the relative roles of air pollution and cold days, because daily values of both pollution and temperature were available only for this period. However, the association between the number of cold days and ARD deaths could be examined for a longer period (1991–2011), for which both daily temperature and ARD deaths were available. The number of cold days (Tc ≤ 20°C) from both IMD and NCEP daily temperatures shows (Fig. 7, top panel) a significant (>90% level) correlation with the actual number of ARD deaths in Delhi. Also, there is no appreciable trend in either the number of cold days or the number of ARD deaths. However, the number of ARD deaths as a percentage of (annual) population shows a significant negative trend (Fig. 7, bottom panel). For this longer period also, the threshold of 20°C for cold days provides the best and most consistent estimates (see Supplementary Fig. S2). As noted earlier, death due to ARD depends on a number of (non-meteorological) factors. Thus the decreasing trend in ARD deaths as a percentage of population, in spite of no decreasing trend in the number of cold days, may be attributed to factors like better health care.
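
The contrast between a flat series of deaths and a declining series of deaths per head of population can be made concrete with an ordinary least-squares trend. The sketch below is a minimal illustration under stated assumptions: the annual deaths and population figures are invented placeholders, and a straight-line fit is assumed as the trend measure, which may differ from the one used for Fig. 7.

```python
def deaths_as_percent_of_population(deaths, population):
    """Annual deaths expressed as a percentage of that year's population."""
    return [100.0 * d / p for d, p in zip(deaths, population)]


def linear_trend(years, values):
    """Ordinary least-squares slope (units per year) and intercept."""
    n = len(years)
    mean_year = sum(years) / n
    mean_value = sum(values) / n
    sxx = sum((y - mean_year) ** 2 for y in years)
    sxy = sum((y - mean_year) * (v - mean_value) for y, v in zip(years, values))
    slope = sxy / sxx
    return slope, mean_value - slope * mean_year


# Placeholder annual values; NOT the Delhi record.
years = [2000, 2001, 2002, 2003]
deaths = [5000, 5100, 4900, 5000]                    # roughly flat
population = [10.0e6, 11.0e6, 12.0e6, 13.0e6]        # steadily growing

pct = deaths_as_percent_of_population(deaths, population)
slope_deaths, _ = linear_trend(years, deaths)
slope_pct, _ = linear_trend(years, pct)
print(f"trend in deaths: {slope_deaths:.1f} per year")
print(f"trend in deaths as % of population: {slope_pct:.5f} per year")
```

In this toy case the number of deaths barely changes while the percentage falls every year, simply because the population grows; this is the pattern that would be attributed to non-meteorological factors.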

Figure 7

Number of ARD deaths reported during 1991–2011 with (a) percentage of cold days (≤20°C) and (b) ARD deaths as percentage of population and annual population.

The corresponding observed ARD deaths have been adopted from Government of NCT of Delhi35. The daily temperature data is from NCEP38 and IMD40 over Delhi (1991–2005) and Station data (Safdarjung, Delhi) from Russia's Weather server (2006–2011)41. The first number in bracket in the top panel represents the correlation coefficients between number of deaths from observations and percentage of cold days; the second number provides the significance level for the corresponding correlation coefficient.

Analysis of cold days for the shorter period 1991–2011 is consistent with the results for the period 1969–2005. Both data sets (IMD and NCEP) show a comparable number of cold days in a year as well as comparable inter-annual variability (standard deviation) for the three thresholds (see Supplementary Fig. S3), although NCEP data show fewer cold days than IMD data for Tc ≤ 20°C. Importantly, the mean values and the standard deviations for the study period (1991–2011) are consistent and comparable with the corresponding values for the longer period (1969–2011).

While we have used the number of cold days as the driver for ARD, temperature itself may act as a driver for ARD; for example, we may assume the number of cases to be inversely proportional to temperature (equation 3). Further, in addition to temperature, the viruses also depend on other meteorological variables like humidity and rainfall, with direct proportionality (equations 4, 5); in particular, rainy days may also be associated with increased household crowding leading to ARD. In order to examine the roles of the meteorological variables as drivers of acute respiratory tract disease, we have explored estimates of ARD based on each of the variables (temperature, rainfall and humidity) as well as all the three variables combined (equations 3–6). It was found, however, that none of the variables or their combination (Fig. 8) produced any appreciable association with ARD.

Figure 8

Number of deaths from ARD for the years 1991–2011 from observations and estimates with (a) only temperature, (b) only rain, (c) only relative humidity and (d) with all three variables.

The numbers in the bracket represent the correlation coefficients between observed and estimated ARD deaths. The weather variables are from NCEP daily reanalysis data38 (2.5° × 2.5°).

Discussion

Although deaths due to ARD depend on many factors like medication, accessibility to health care and timely intervention, both morbidity and deaths due to ARD are found to be strongly correlated with the number of cold days. The inter-annual variability in ARD deaths in Delhi is strongly associated with the number of cold days in a year. Thus, our results show that the number of cold (Tc ≤ 20°C) days is a good estimator (proxy) for ARD-related illness and deaths. Of course, air pollutants also play an important role, as expected. As noted earlier, low temperature (cold day) is a proxy for increased infection; however, it is directly and easily measurable and thus can be used for issuing advisories.

In addition to the deaths due to ARD, low temperatures also lead to a number of non-fatal illnesses like common cold, acute sinusitis, acute pharyngitis and acute tonsillitis. Loss of working hours can lead to loss of income and increased stress; for students, loss of school days can lead to anxiety for the students as well as for the parents. Such diseases reduce income, sometimes critically for the poorer sections, while also overburdening the health services, especially over regions where outbreaks are seasonal (peaks in time) and affect a large population. It is realistic to assume that the actual number of ARD incidences is much larger than the number of reported cases. Thus any measure to reduce ARD load, such as through effective advisories, can have a significant positive impact.

In addition to low temperature, other meteorological variables like humidity, wind speed (chill factor) and the boundary layer height (fumigation) are likely to affect ARD incidences19,20. Thus advisories based on meteorological variables as well as pollutant (RSPM) concentrations can provide useful pro-active mitigation. Proper investigations of the association between weather variables and ARD are limited by inadequately designed observation programmes, both for weather variables and for ARD incidences. Thus our study should be considered a basis for a comprehensive investigation with carefully designed observations. Symptomatically, there are many similarities between ARD and asthma. It is possible that the estimates of ARD cases are biased by cases that are in fact asthma. However, this may not affect our basic premise and formalism, as it only requires recalibration (scaling) of the model parameters.

We have used gridded data over Delhi to consider the large-scale environment. While station-scale data may be used for higher localization, such data cannot represent broad exposure scenarios involving movement of people. However, studies based on ARD-associated hospitalization of children (< 16 years of age) and station-scale meteorological variables at the University of Mainz, Germany, also showed an association between weather variables and ARD19; it was noted that certain pathogens were correlated with temperature and humidity. Our results also show that ARD can respond to climate change through its association with temperature; although the number of cold days in a year in Delhi does not show any appreciable trend in the recent past (see Supplementary Fig. S1), such effects may be important elsewhere. However, the definition of cold days is location-specific and can change due to adaptation.

An important finding is that daily temperature alone has a significant association with ARD. As shown earlier, our conclusions regarding the role of weather in ARD are also supported by analysis over three other states of India; these regions (non-urban and non-industrial) are much less affected by air pollution (Fig. 2). It is thus possible to take advantage of the emerging effectiveness of high-resolution (meso-scale) forecasts of atmospheric variables10,31 and use such a relation to issue advisories for vulnerable areas; the predictive equation is capable of capturing the significant inter-annual variability in ARD deaths. It is worth emphasizing that although the skill of weather forecasts may not be adequate for certain applications, in the present case only threshold-based (number of cold days) forecasts are used. At the same time, regulatory measures to reduce the number of hazardous days in terms of concentrations of RSPM and NO2 can bring down the ARD morbidity rates. Thus a combination of adaptation (through exposure advisories) and mitigation (through control of pollution) could bring a significant reduction in ARD deaths in Delhi.

For investigation of drivers of ARD, the relevant data is the number of reported ARD cases, since deaths due to ARD also depend on various other factors, as discussed earlier. However, the number of cases and the number of deaths as a percentage of the number of cases are found to be strongly correlated for two of the four states, the exceptions being Delhi and Kerala (Fig. 2). Still, it is desirable to generate data on actual ARD cases, such as through surveys of hospital admissions, to establish more robust and direct relations. It will also help to obtain data separately on infectious and non-infectious ARD cases for improved modelling.

It needs to be emphasized that we have not considered different viability thresholds for individual respiratory viruses on infectious surfaces and in the air; such viability is likely to depend strongly upon meteorological variables like temperature and humidity. Thus a ‘cold day’ (vulnerability to ARD) may also depend on the overall meteorological conditions. Similarly, for the situation where ARD cases are primarily the result of overcrowding, ‘hot days’ and rainy days may also be associated with ARD. However, our analysis did not show any appreciable association between ARD and rainfall (Fig. 8). Still, these issues need closer examination for different locations and populations. Similarly, we have considered a single value for defining a cold day; however, in practice this may also depend on the individual. Incorporation of these refinements is challenging, but is expected to improve the applicability of the model.

As emphasized earlier, our work can be interfaced with dynamical forecasts of ARD at different scales. Feasibility and potential of such forecasts for air pollutants have been shown in our earlier works11,31,32,33. While the present work is focused on Delhi and a few other states in India, the methodology is quite generic and can be applied anywhere. However, the threshold for cold days for Delhi (≤20°C) is not necessarily applicable everywhere, as it will depend on the general climate of the region and tolerance of the people20,21; for generally cold countries, this threshold will be lower. Similarly, the parameters used in the estimation model will need calibration for a given region to implicitly allow for various local factors.

Methods

The data on ARD morbidity over Gokulpuri were adopted from an independent survey-based study2, and the data on ARD deaths over Delhi were taken from the records of the Government of the National Capital Territory (NCT) of Delhi35. Often death due to low temperature (winter mortality) is calculated as the excess of deaths in the winter season over the non-winter season34. In our case, we have considered the actual ARD-related deaths and morbidity rates, to exclude other winter deaths such as those due to hypothermia. It is recognized that infection or death due to ARD is not generally synchronous with the occurrence of low temperature. The lag is typically a few days, most likely due to the incubation period for manifestation32 in the case of morbidity, or due to the treatment period in the case of death. However, as we consider the total annual ARD load and the number of cold days in a year, this lag does not affect our calculations. No independent survey was undertaken by the authors and all patient data is anonymous.

Area of investigation

The study area is primarily the city of Delhi, India; situated around 77° E and 28° N, it experiences strong contrast in weather across the year, with about 103 days with daily temperature below 20°C (see Supplementary Fig. S1). In addition, we have considered three other states: Haryana, Himachal Pradesh and Kerala.

Data on ARD morbidity and deaths

The ARD morbidity data have been adopted from a report on a study carried out by a survey team consisting of 14 FETP participants (WHO fellows) in an urban slum area, ‘Gokul Puri’ in Delhi, among under-5 children for 20042; the data are two-weekly incidences (i.e., two two-week periods per month) of ARD per 100 children in Gokulpuri. The study covered a sample size of about 1400 under-5 children. Interviews of the caretakers of the under-five children were conducted. The mothers or caretakers whose children were suffering from ARD in the two weeks preceding the survey were also asked about the history of ARD episodes and about health care practices. The data is available at (medind.nic.in/icb/t07/i5/icbt07i5p471.pdf).

Year-wise distribution (1991–2011) of deaths from respiratory diseases in Delhi, Haryana, Himachal Pradesh and Kerala

The data on ARD deaths in Delhi have been taken from the annual report on registration of births and deaths: 2011, Government of National Capital Territory (NCT) of Delhi35, and the data on ARD cases for the other three states (Haryana, Himachal Pradesh and Kerala) have been taken from http://www.indiastat.com36.

Observed data on air pollution

The observed pollution data over Delhi were adopted from data made available by the Central Pollution Control Board (CPCB)37. These data were downloaded and published in our earlier works11,31, with reference to the source (CPCB, India). However, a recent search of the CPCB, India, website did not give access to these data; we have therefore used the data already collected and published.

Meteorological data

The meteorological data used to determine the number of cold days were adopted from three sources: the daily NCEP Reanalysis, available over a 2.5° × 2.5° global grid38, with appropriate interpolation39; IMD daily data (1.0° × 1.0°) over Delhi40; and station data (Safdarjung, Delhi) from Russia's Weather server41. In addition, we have utilized daily near-surface humidity and daily rainfall from the NCEP Reanalysis.
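As an illustration of how a value on the 2.5° × 2.5° grid can be brought to the location of Delhi, the following is a minimal sketch using bilinear interpolation. The interpolation scheme actually used is not specified in this section, so bilinear interpolation is an assumption, and the function name `bilinear_interpolate` and its arguments are hypothetical.

```python
from bisect import bisect_right


def bilinear_interpolate(lats, lons, field, lat, lon):
    """Bilinearly interpolate a gridded field to a single point.

    lats, lons : ascending lists of grid latitudes and longitudes (degrees)
    field      : nested list with field[i][j] valid at (lats[i], lons[j])
    lat, lon   : target point, assumed to lie inside the grid

    ASSUMPTION: bilinear weighting; the source only says
    "appropriate interpolation" without naming the scheme.
    """
    # Index of the grid cell whose south-west corner is (lats[i], lons[j]).
    i = min(max(bisect_right(lats, lat) - 1, 0), len(lats) - 2)
    j = min(max(bisect_right(lons, lon) - 1, 0), len(lons) - 2)

    # Fractional position of the target point inside the cell (0..1).
    ty = (lat - lats[i]) / (lats[i + 1] - lats[i])
    tx = (lon - lons[j]) / (lons[j + 1] - lons[j])

    # Interpolate along longitude on both latitude rows, then along latitude.
    south = (1 - tx) * field[i][j] + tx * field[i][j + 1]
    north = (1 - tx) * field[i + 1][j] + tx * field[i + 1][j + 1]
    return (1 - ty) * south + ty * north
```

Applied day by day to the four grid points surrounding 77° E, 28° N, this would give a daily temperature series for Delhi from which cold days can be counted.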

Threshold for cold days

The definition of a cold day, based on daily average temperature, is location-dependent and depends on what is considered the local winter. The threshold of 20°C was adopted based on this consideration as well as on the best association with ARD (correlation coefficient significant above the 90% level) at seasonal level (Fig. S2). Although 20°C may appear unrealistic as a threshold for defining cold days, we have considered it for two reasons: first, a daily average temperature of 20°C may easily correspond to a considerably lower minimum temperature; secondly, the observed and area-averaged data may not account for small areas with lower temperature. Besides, there are various observational biases that may affect the threshold. While we have adopted the threshold of 20°C based on the correlation between ARD morbidity and cold days at monthly scale, our subsequent analysis focuses on ARD deaths and the number of cold days at inter-annual scale; thus both the time scale (annual vs monthly) and the data (deaths vs morbidity) differ between calibration and analysis, making any association between ARD deaths and cold days at inter-annual scale highly non-trivial.
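The threshold selection described above can be sketched as follows: count the days with daily average temperature below a candidate threshold in each month, correlate the monthly counts with monthly ARD morbidity, and keep the threshold with the highest correlation. This is a minimal sketch only; the function names, the use of the Pearson coefficient and the list of candidate thresholds are assumptions, and no significance test is included.

```python
from math import sqrt


def count_cold_days(daily_mean_temps, threshold=20.0):
    """Number of days with daily average temperature (deg C) below threshold."""
    return sum(1 for t in daily_mean_temps if t < threshold)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def best_threshold(monthly_temps, monthly_morbidity, candidates):
    """Pick the candidate threshold whose monthly cold-day counts
    correlate best with monthly morbidity.

    monthly_temps     : list of lists, daily mean temperatures per month
    monthly_morbidity : one morbidity value per month (hypothetical input)
    candidates        : thresholds to try (hypothetical choice)
    """
    scores = {}
    for thr in candidates:
        cold = [count_cold_days(month, thr) for month in monthly_temps]
        # The correlation is undefined when the counts do not vary.
        if len(set(cold)) > 1:
            scores[thr] = pearson_r(cold, monthly_morbidity)
    return max(scores, key=scores.get), scores
```

For example, `count_cold_days(temps_for_year)` with the default threshold gives the annual number of cold days used in the inter-annual analysis, where `temps_for_year` is a hypothetical list of daily average temperatures.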

Population data

Population data for Delhi were collected from the report made available by the Government of National Capital Territory (NCT) of Delhi (http://delhigovt.nic.in/newdelhi/dept/economic/arhtm.asp)42, and the data for Haryana, Himachal Pradesh and Kerala were collected from http://www.indiastat.com36.

Quantitative relation between ARD deaths and the number of hazardous days

To quantify the number of ARD deaths for the nth year in terms of hazardous days, we have used an equation

where the three inputs are, respectively, the number of days on which RSPM and NO2 are above their permissible levels (75 μg m−3 and 30 μg m−3, respectively) and the number of cold days.

The parameters α, β and γ are determined through a process of calibration in which each parameter is allowed to vary in a given range to arrive at the minimum error given by equation (2) for a chosen year; these values of the parameters are then kept constant for the rest of the years. We consider the error quantity

where Se(i, λ) and So(i, λ) are, respectively, the estimated and observed values of ARD deaths for the year i, for a given value of the parameter λ. For the standard case (with all three processes present, Fig. 5), the values adopted through calibration are α = 9, β = 8 and γ = 11. When only one or two variables were used, the calibration process was repeated to determine the parameter values for maximum skill for the respective case.
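The calibration procedure can be illustrated with a small grid search. The sketch below rests on two assumptions that are not confirmed by the text: that the estimate is a linear combination of the three day counts weighted by α, β and γ, and that the error is the relative absolute difference between estimated and observed deaths. All function names, parameter ranges and day counts are hypothetical.

```python
from itertools import product


def estimate_deaths(n_rspm, n_no2, n_cold, alpha, beta, gamma):
    """ASSUMED linear form: weighted sum of the three hazardous-day counts."""
    return alpha * n_rspm + beta * n_no2 + gamma * n_cold


def calibrate(n_rspm, n_no2, n_cold, observed, alphas, betas, gammas):
    """Grid search over parameter ranges for one chosen calibration year.

    Returns (error, alpha, beta, gamma) for the first combination that
    attains the minimum ASSUMED relative absolute error.
    """
    best = None
    for a, b, g in product(alphas, betas, gammas):
        estimated = estimate_deaths(n_rspm, n_no2, n_cold, a, b, g)
        err = abs(estimated - observed) / observed
        if best is None or err < best[0]:
            best = (err, a, b, g)
    return best


# Hypothetical calibration year: day counts and observed deaths are made up.
err, alpha, beta, gamma = calibrate(
    n_rspm=200, n_no2=150, n_cold=100, observed=4100,
    alphas=range(0, 21), betas=range(0, 21), gammas=range(0, 21),
)

# The calibrated values are then held fixed for every other year.
other_year_estimate = estimate_deaths(180, 160, 90, alpha, beta, gamma)
```

In this sketch a single calibration year supplies one constraint for three parameters, so several combinations can reach the same minimum error; the search returns the first one it encounters, which means the chosen parameter ranges influence the result.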

Quantitative relation between ARD deaths and meteorological variables

In addition to the number of cold days, we have also explored the direct association between meteorological variables and ARD; we assume that ARD is enhanced by lower temperature, higher humidity and higher rainfall. To examine these associations, we have considered each variable separately as well as in combination through the following equations:

Here T(n, i), RH(n, i) and R(n, i) are, respectively, the temperature, relative humidity and rainfall on the ith day of the nth year. As in the case of equation (1), the parameters δ, ε, η and ϕ are determined through a process of calibration. The calibration was repeated to determine the parameter values for maximum skill in each case; the optimum values adopted for δ, ε, η and ϕ in our analysis were 120000, 145, 7500 and 160, respectively.