Introduction

Worldwide, more than 140 million people are chronically exposed to inorganic arsenic (iAs)1, and this exposure has been linked to both malignant and non-malignant pulmonary dysfunction2,3,4,5,6. As with many arsenic-associated diseases7,8,9,10,11,12,13, poor pulmonary outcomes are associated with an exposed individual’s ability to metabolize arsenic11,12,13. The metabolism of iAs is a multi-stage process, during which iAs can remain unmethylated, or exist in one of several methylated states14. In this process, the most abundant intermediate metabolite is monomethylarsenate (MMA), which can be further methylated to produce dimethylarsinate (DMA)14. People who are “inefficient metabolizers” are characterized by having higher relative percentages of iAs and MMA in their blood and urine, and “efficient metabolizers” are characterized by having a higher percentage of DMA. Arsenic metabolism appears to have an inherited component, as multiple genome-wide association studies have identified locations in the genome where variation is associated with an efficient arsenic metabolism phenotype15,16,17,18.

The health consequences of inefficient inorganic arsenic metabolism have largely been studied in populations who have been exposed to drinking water with high levels of inorganic arsenic (exceeding the 10 µg/L threshold established by the World Health Organization), or populations with known occupational exposure7,8,9,12,13. In populations with no known water or occupational exposure, arsenic was often assumed to be low, and in these populations, few studies have measured iAs metabolism. However, rice has recently been recognized as a significant route of exposure to inorganic arsenic19,20, suggesting that in populations with high rice consumption, arsenic exposure may reach a sufficient level such that inefficient metabolizers could be at increased risk for arsenic-associated diseases.

In study populations where inorganic arsenic metabolism has not been directly measured, but the study population has been genotyped, it is possible to study its health effects indirectly through the use of a two-sample Mendelian randomization approach21. This approach exploits the reported relationship between individual genotypes and the phenotype of arsenic metabolism15,16,17,18. Using an instrumental variable framework, these genotype-arsenic metabolism associations are then leveraged to estimate the arsenic metabolism efficiency in each genotyped study participant. This can then be combined with study-specific associations between genotype and pulmonary function. If certain assumptions are met, this estimate reflects the effect of arsenic metabolism efficiency on pulmonary traits in the target study population. As such, in populations where there may be arsenic exposure due to diet, a Mendelian randomization approach could identify whether inefficient arsenic metabolizers are at higher risk of disease.
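For a single variant, the logic described above reduces to a ratio of two regression coefficients, often called the Wald ratio. The sketch below is illustrative only and is not the analysis code used in this study: the function names `wald_ratio` and `odds_ratio_ci` and all example numbers are hypothetical, and the standard error is a first-order approximation that ignores uncertainty in the SNV-exposure estimate.

```python
import math


def wald_ratio(beta_gx, beta_gy, se_gy):
    """Single-variant Mendelian randomization estimate.

    beta_gx: SNV-exposure association (e.g. change in %iAs per allele),
             taken from a published study.
    beta_gy: SNV-outcome association (e.g. log odds ratio per allele),
             estimated in the target study.
    se_gy:   standard error of beta_gy.
    Returns the estimated change in the outcome per unit of exposure and
    a first-order standard error.
    """
    if beta_gx == 0:
        raise ValueError("SNV-exposure association must be non-zero")
    estimate = beta_gy / beta_gx
    std_error = se_gy / abs(beta_gx)
    return estimate, std_error


def odds_ratio_ci(log_or, std_error, z=1.96):
    """Convert a log odds ratio and its SE to an OR with a 95% interval."""
    return (
        math.exp(log_or),
        math.exp(log_or - z * std_error),
        math.exp(log_or + z * std_error),
    )


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the arithmetic.
    est, se = wald_ratio(beta_gx=2.0, beta_gy=0.4, se_gy=0.1)
    print(odds_ratio_ci(est, se))
```

Because the ratio divides by the SNV-exposure association, a weak instrument (a value of `beta_gx` near zero) inflates both the estimate and its uncertainty, which is one reason the variant selection criteria matter.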

This study examines evidence for an association between inorganic arsenic metabolism and measures of pulmonary dysfunction in a Hispanic/Latino population in the United States by implementing a two-sample Mendelian randomization approach. The study focuses on those with estimated higher inorganic arsenic exposure due to diet, as the effect of arsenic metabolism is only expected to be seen in those with appreciable arsenic exposure.

Methods

Study population

The Hispanic Community Health Study/Study of Latinos (HCHS/SOL) is a community-based cohort study of Hispanic and Latino adults in four cities in the United States: Chicago, Miami, the Bronx, and San Diego. The baseline examination22 took place between 2008 and 2011 enrolling 16,415 participants. After restricting to participants with genotyping and active consent (n = 12,633), and those without missing data (described below), 12,602 participants were included in the asthma analysis and 11,192 were included in the spirometry analysis.

Genotyping in HCHS/SOL

The genotyping and quality control for HCHS/SOL are described elsewhere23. Briefly, DNA from blood was genotyped on a custom Illumina HumanOmni2.5-8v1-1 array. Samples were excluded due to sex mismatch, chromosomal anomalies, high missing call rates, and evidence of contamination or batch effects. In total, 12,633 samples passed quality control and had complete data on genetic ancestry and active consent. Variants were excluded for high missing call rates, Mendelian errors, duplicate-sample discordance, and deviation from ancestry-specific Hardy–Weinberg equilibrium (p < 10−5). These cleaned data were then imputed to the 1000 Genomes Project phase 1 reference panel24. Pre-phasing was performed with SHAPEIT2 (v.2.r644)25, and imputation with IMPUTE2 (v.2.3.0)26.

Identifying variants associated with arsenic metabolism efficiency

A literature search was performed through PubMed and the NHGRI-EBI GWAS catalog to identify genetic variants that influence arsenic metabolism efficiency. A variant was included if the p-value of the association was below 5 × 10–8, and the relationship was measured linearly as %iAs, %DMA, and %MMA. Three single nucleotide variants (SNVs) in two studies met these criteria (Table 1)7,27. While other variants have been identified (e.g. Refs.17,28,29,30,31), our approach required that the variants had their effects modeled linearly, a criterion met only by those listed in Table 1.

Table 1 Effect sizes and standard errors for SNV-arsenic metabolism efficiency relationships found in published literature.

Of the three identified variants, one (rs11191527) was directly measured by the HCHS/SOL genotyping array. The other two were imputed with high confidence (r2 = 1 for both using IMPUTE2 metrics).

Smoking status

The analysis was stratified by ever- and never-smoking history to allow for potentially different associations by smoking history. This was done to account for the effect tobacco smoking has on lung function32, the differential arsenic methylation profile seen in smokers33, and previous research that suggests that the health impacts of arsenic may be different in smokers34,35,36. Former smokers were grouped with ever smokers rather than never smokers in order to account for the sustained reduced lung function and increased asthma risk seen in those who quit smoking37,38. Ever- and never-smoking history was determined by self-report from the baseline questionnaire, with never-smoking defined as smoking fewer than 100 cigarettes over a lifetime. Thirteen participants did not answer the smoking questions and were therefore excluded. 5079 ever-smokers and 7541 never-smokers remained for analysis.

Classifying arsenic exposure through rice consumption

We ranked the study participants according to their inorganic arsenic exposure by inferring their rice consumption from answers to a food frequency questionnaire. In this study population, water is unlikely to be a source of arsenic exposure, as all participants were served by public water systems that show no evidence of arsenic contamination, with all four cities reporting arsenic concentrations below 2 µg/L or below the detectable limit39,40,41,42. While various foods can provide arsenic exposure, we focused on the variability in inorganic exposure through variability in rice consumption, since rice has consistently been identified as the main source of dietary iAs in studies that estimate dietary contributions of it43,44. Seafood can contain arsenic; however the arsenic species found in seafood is largely organic arsenobetaine, which is not metabolized, and does not have strong evidence of toxicity45,46. Although the level of arsenic found in rice can vary based on the location of cultivation, the harvesting method, and cooking style47, in the United States, most rice varieties available for purchase contain between 75 and 165 µg/kg of inorganic arsenic (along with DMA)48, such that among food products reported by HCHS/SOL, rice is the most commonly consumed food with high concentration of inorganic arsenic. Taken together, this suggests that characterizing rice consumption will provide an adequate ranking of participants with regard to their exposure to inorganic arsenic.

Rice consumption was inferred on the basis of two 24-h recalls of food intake, completed by HCHS/SOL participants, at enrollment and within one month of enrollment. Eleven participants did not complete the diet recall section and were excluded.

In the diet recall, the number of servings of “grains, flour, and dry mixes” that were eaten over the last day were assessed by three questions: one question focused on whole grain, the second on mixed-whole-grain, and the third on refined grains. While these three questions directed the participants to consider food in addition to rice (including amaranth, barley, buckwheat, corn flour, millet, oats, rye, sorghum, spelt, teff, triticale, quinoa, and wheat flour as well as rice), pasta and corn tortillas were assessed separately, suggesting that rice is the primary grain that participants would have referred to for this question. Further, in the subset of participants (n = 2048) who later answered a separate food propensity questionnaire, those who were above the 80th percentile in grain consumption were more than twice as likely to report eating rice more than once a week (OR 2.33, p = 1.05 × 10–4) and more than once a day (OR 2.69, p = 1.02 × 10–14), suggesting that grain consumption can be used to identify many of the high rice eaters.

In order to have a sufficient sample size for the binary outcomes assessed (statistical model described below), the top fifth of grain consumers (i.e., those above a cutoff at the 80th percentile) were classified as “high inferred consumers of rice.” These participants each reported more than 3.4 servings of grains (1.7 cups) over the two recalls (n = 2522 high consumers; 10,087 low consumers).
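The percentile-based classification can be sketched as follows. This is a minimal illustration rather than the study's code: the helper names `percentile` and `classify_high_consumers` and the serving counts are hypothetical, and the linear-interpolation percentile used here is an assumption, since the text does not state which percentile definition was applied.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def classify_high_consumers(servings, q=80):
    """Return the cutoff and a True/False flag per participant.

    servings: total grain servings reported across the diet recalls.
    A participant is flagged as a high inferred consumer of rice when
    their servings exceed the q-th percentile of the sample.
    """
    cutoff = percentile(servings, q)
    return cutoff, [s > cutoff for s in servings]


if __name__ == "__main__":
    # Hypothetical serving counts for ten participants.
    servings = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
    cutoff, flags = classify_high_consumers(servings)
    print(cutoff, sum(flags))
```

A strict inequality mirrors the wording "more than 3.4 servings"; with tied values at the cutoff, the flagged group can be slightly smaller than one fifth of the sample.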

Assessment of pulmonary function and asthma history

All participants without recent cardiovascular events or surgery were invited to perform prebronchodilator spirometry (n = 12,095 with spirometry). Spirometry was conducted in accordance with European Respiratory Society and American Thoracic Society (ATS) guidelines49 using a dry rolling seal spirometer with automated quality checks (Occupational Marketing, Inc., Houston, TX) with overreading by one investigator and a three-curve acceptability minimum. Participants whose effort was rated as “maximal” and whose FVC quality attribute was rated “A,” “B” or “C” were included in the spirometry analysis (11,192 (89%) participants; A = “exceeds ATS data collection standards”, B = “meets ATS data collection standards,” C = “potentially usable value, but does not meet all ATS standards”; an industry-recognized rubric that classifies whether the measurement conforms to ATS standards for a usable value49). In order to better model the participants who used asthma medications [n = 899 (7%) on asthma medications], their FVC and FEV1 were multiplied by 0.88 (the mean difference found in Du et al.,50 attenuated to reflect imperfect medication adherence) to estimate a participant’s spirometry result without medication.

Pulmonary dysfunction is often characterized by identifying those whose spirometry measures fall below a population-based Lower Limit of Normal (LLN). These LLNs are identified by characterizing the distribution of each spirometry measure in a healthy reference population, within strata of ethnic background, sex, age, and height. The LLN threshold is defined at the fifth percentile of each spirometry measure in this reference population, and in the wider population, those below this LLN have been found to be at higher risk of respiratory-related morbidity and mortality51,52,53. HCHS/SOL participants’ spirometry measures of FEV1, FVC, the ratio of FEV1 to FVC, and PEF were compared to the respective distributions in a healthy Hispanic/Latino reference population54 to identify participants below the LLN.
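The comparison to the LLN, together with the medication correction described above, can be illustrated with a short sketch. It makes simplifying assumptions that should be read as such: the LLN is taken as the empirical fifth percentile of a list of reference values for the participant's stratum (published reference equations are normally used in practice), and the names `lower_limit_of_normal` and `below_lln` and all numbers are hypothetical.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def lower_limit_of_normal(reference_values):
    """Fifth percentile of a measure in a healthy reference stratum."""
    return percentile(reference_values, 5)


def below_lln(measured, reference_values, on_asthma_medication=False,
              medication_factor=0.88):
    """Flag a spirometry value that falls below the LLN.

    For participants on asthma medication, the measured value is first
    multiplied by medication_factor (0.88 in the text) to approximate
    the result without medication.
    """
    adjusted = measured * medication_factor if on_asthma_medication else measured
    return adjusted < lower_limit_of_normal(reference_values)


if __name__ == "__main__":
    # Hypothetical FVC values (litres) for one reference stratum.
    reference = [2.0 + 0.1 * i for i in range(21)]
    print(below_lln(2.3, reference))
    print(below_lln(2.3, reference, on_asthma_medication=True))
```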

History of asthma was assessed through self-report from a standard questionnaire55,56. Participants were classified as having lifetime asthma if they answered yes to “have you ever had asthma?” and “was it diagnosed by a medical doctor?” Asthma diagnosis was further refined as “current” asthma if the participant answered yes to “do you still have asthma?” or they reported use of an anti-asthmatic medication in the last year, and “past” asthma otherwise.
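The questionnaire logic above amounts to a small decision rule, sketched here for clarity. The function name `classify_asthma`, its argument names, and the label "none" for participants without a doctor-diagnosed history are hypothetical choices made for this illustration.

```python
def classify_asthma(ever_had, doctor_diagnosed, still_has, medication_last_year):
    """Classify asthma history from four yes/no questionnaire answers.

    Lifetime asthma requires a "yes" to both "have you ever had asthma?"
    and "was it diagnosed by a medical doctor?". Among those, asthma is
    "current" if the participant still has asthma or used an
    anti-asthmatic medication in the last year, and "past" otherwise.
    """
    if not (ever_had and doctor_diagnosed):
        return "none"
    if still_has or medication_last_year:
        return "current"
    return "past"


if __name__ == "__main__":
    print(classify_asthma(True, True, False, False))
    print(classify_asthma(True, True, False, True))
    print(classify_asthma(True, False, True, False))
```

Under this rule, self-reported asthma without a doctor's diagnosis is not counted, which is the group added back in the sensitivity analysis.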

Statistical methods for SNV-pulmonary trait associations within HCHS/SOL

The relationships between each of the three SNVs and the pulmonary traits were calculated using a mixed-effect linear model23 using the GENESIS R package57. GENESIS controls for both the clustering and complex survey design utilized by HCHS/SOL, and makes use of mixed effect matrices to control for kinship, household, and block group. In addition, the top five principal components, genetically-ascertained ancestry group58, and the log of the sampling weights were controlled for as covariates via their inclusion as fixed effects. Each analysis was repeated within sub-strata of inferred rice consumption and smoking status.

Statistical methods for Mendelian randomization

To estimate the arsenic metabolism-pulmonary trait relationships, the SNV-trait associations (calculated as described above) were combined with the genotype-arsenic metabolism effect sizes and standard errors (as listed in Table 1) using the process described by Burgess et al.21 While there are several methods for incorporating multiple SNVs into Mendelian randomization analyses59,60, the Burgess method mitigates the possibility of inflating type I error due to correlation between the variants (between rs9527 and rs11191527, R2 = 0.28, D′ = 0.87). In this method, principal components of the three identified variants were calculated; these principal components were then used as the instrument that estimated the effect of the genetically-influenced aspects of arsenic metabolism on the respiratory traits21.
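To show why correlation between variants matters when summary statistics are combined, the sketch below implements a generalized (correlation-aware) inverse-variance weighted estimate. This is a related technique and not the principal-component procedure described above; it is offered only as an illustration under stated assumptions. The function names `solve` and `ivw_correlated`, the signed correlation matrix `rho`, and every number in the example are hypothetical.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ivw_correlated(beta_x, beta_y, se_y, rho):
    """Inverse-variance weighted estimate allowing for correlated variants.

    beta_x: SNV-exposure associations; beta_y: SNV-outcome associations;
    se_y: standard errors of beta_y; rho: signed correlation matrix
    between the variants. With omega[i][j] = se_y[i] * se_y[j] * rho[i][j],
    the estimate is (bx' omega^-1 by) / (bx' omega^-1 bx) and its standard
    error is sqrt(1 / (bx' omega^-1 bx)).
    """
    n = len(beta_x)
    omega = [[se_y[i] * se_y[j] * rho[i][j] for j in range(n)] for i in range(n)]
    u = solve(omega, beta_x)  # u = omega^-1 bx (omega is symmetric)
    denom = sum(ui * bx for ui, bx in zip(u, beta_x))
    estimate = sum(ui * by for ui, by in zip(u, beta_y)) / denom
    return estimate, math.sqrt(1.0 / denom)


if __name__ == "__main__":
    # Hypothetical summary statistics for two correlated variants.
    rho = [[1.0, 0.5], [0.5, 1.0]]
    print(ivw_correlated([1.0, 2.0], [0.2, 0.5], [0.1, 0.1], rho))
```

When `rho` is the identity matrix, the formula reduces to the standard inverse-variance weighted estimate; treating correlated variants as independent would understate the standard error.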

Sensitivity to modeling assumptions

To assess whether medical diagnosis of asthma was sensitive to arsenic metabolism, the analysis was repeated including the 6% of asthmatics with self-reported asthma that had never been diagnosed by a doctor. To assess whether absolute levels of FEV1, FVC, and PEF were more sensitive to arsenic metabolism than the threshold of below LLN, each was analyzed continuously. Similarly, FEV1, FVC, and PEF expressed as percentages of their predicted values were also analyzed continuously to determine whether those phenotypes were more sensitive to arsenic metabolism than the dichotomous outcome of being below the LLN. To evaluate whether the spirometry results were sensitive to the correction for medication use, the spirometry analyses were repeated on uncorrected data where medication use was controlled for as a confounder, and again in data where those on medications were excluded. To assess whether the results were sensitive to the quality of the spirometry measurements, the analysis was repeated excluding results where the quality of the FEV1 and FVC curves was rated “C”, which did not fully meet the American Thoracic Society standards, but was still rated as “potentially usable”.

Ethics approval and consent to participate

The study was conducted with the approval and oversight of the Ethics and Institutional Review Boards of Albert Einstein School of Medicine, University of Illinois Chicago (2013-1261), University of Miami, and San Diego State University, as well as the coordinating center institutions University of North Carolina Chapel Hill and University of Washington. Informed consent was obtained from all participants. All methods were performed in accordance with the relevant guidelines and regulations.

Results

Characteristics of the study population are found in Table 2. The Mendelian randomization estimates of the effect of arsenic metabolism on pulmonary function for high inferred consumers of rice are presented in Table 3, stratified by smoking status.

Table 2 Characteristics of the study sample.
Table 3 Mendelian randomization estimates for the associations between three measures of arsenic metabolism efficiency and asthma-associated traits among those with high inferred rice consumption (n = 2522).

Never smokers with high inferred rice consumption

Among never-smokers (Table 3, right), inefficient metabolism was associated with an increased risk of a past asthma diagnosis (OR 1.40, 95% CI 1.05–1.86 for %iAs; OR 1.26, 95% CI 1.03–1.54 for %MMA; OR 0.87, 95% CI 0.77–0.98 for %DMA). Additionally, FVC below the LLN was associated with inefficient arsenic metabolism (OR 1.42, 95% CI 1.10–1.83 for %iAs; OR 1.24, 95% CI 1.03–1.50 for %MMA; OR 0.87, 95% CI 0.78–0.97 for %DMA). There was no strong evidence for association with any of the other tested spirometry comparisons to LLN.

Ever smokers with high inferred rice consumption

Among ever-smokers (Table 3, left), inefficient arsenic metabolizers were more likely to have a PEF below the LLN, as for each percentage-point increase in iAs, there was a 54% higher odds of PEF below LLN (OR 1.54, 95% CI 1.10–2.15). Similar detrimental effects on PEF were found with each percentage-point increase in MMA (OR 1.37, 95% CI 1.08–1.73). For each percentage point increase in DMA (the marker of efficient arsenic metabolism), there was a 17% decrease in the odds that a participant would have PEF below LLN (OR 0.83, 95% CI 0.72–0.96). There was no strong evidence for association with asthma or any of the other tested spirometry comparisons to LLN.

Low inferred rice consumption

Among intermediate- and low-consumers of grain, while the magnitude of the associations is often consistent with inefficient arsenic metabolism being associated with pulmonary function, the confidence intervals are generally wide. Given the borderline significance of the two tests that passed the significance threshold, there was no convincing pattern between arsenic metabolism and pulmonary function for either ever-smokers (Table 4, left) or never-smokers (Table 4, right).

Table 4 Mendelian randomization estimates for the associations between three measures of arsenic metabolism efficiency and asthma-associated traits among those with low inferred rice consumption (n = 10,087).

Sensitivity analyses

Broadening the definition of asthma history to include those who reported a history of asthma but did not receive a diagnosis led to substantively similar results (Supplementary Table 1). Spirometry measures modeled directly, or as a percentage of the predicted value for a participant’s ethnicity, age, height, and gender, were both less sensitive to arsenic metabolism than the dichotomous outcome of falling below the clinically relevant lower limit of normal (Supplementary Table 2), although the point estimates are consistent with a protective effect of efficient metabolism and a detrimental effect of inefficient metabolism.

The results were not sensitive to the method used to control for medication, or the quality of spirometry measures included in the analysis (tables available upon request).

Discussion

This analysis suggests that inefficient metabolism of inorganic arsenic is associated with a history of asthma and signs of pulmonary dysfunction. Further, we find that these effects were observed at levels of arsenic exposure that could be acquired through diets that are high in rice. The pulmonary traits that were most influenced by arsenic metabolism differed by smoking history. For never-smokers, inefficient metabolism was associated with increased odds of a past history of asthma and FVC being below the LLN. For each percentage-point increase in %iAs, the odds of a past history of asthma increased by 40%. A similar magnitude of effect was seen on the odds of FVC dropping below the LLN. Among ever-smokers, PEF was the most responsive spirometry measure to inefficient arsenic metabolism, with a percentage-point increase in %iAs associated with a 54% increase in the odds that the participant’s PEF fell below LLN. There were similar detrimental effects seen for increases in %MMA, and protective effects for increases in %DMA. Given that the risk alleles for inefficient metabolism were each associated with an increase in %iAs between 1.3 and 2.7 percentage points (Table 1), this suggests that arsenic metabolism may be responsible for significant variability in pulmonary function.

As we did not expect genotype to increase risk in the absence of arsenic, those participants whose inferred rice consumption was low served as a negative control. In this population a less-clear pattern emerged connecting the ability to methylate arsenic and pulmonary dysfunction (Table 4).

Our analysis utilized a Mendelian randomization approach to complement the existing literature that has suggested that arsenic affects pulmonary function at higher levels of exposure11,12,13. Our work builds upon this earlier research by providing additional evidence to support the hypothesis that at levels of exposure that are consistent with high rice consumption but no known water exposure, poor arsenic metabolizers may be at risk of pulmonary dysfunction, and we also find that FVC may be particularly sensitive to this effect4. Although our analysis focused on methylation, our findings are consistent with results from the MESA study, which suggest that spirometry-based measures of lung function may be worse in participants who were daily rice eaters61, and those found in the Strong Heart Study, which found that respiratory dysfunction is associated with even low-to-moderate arsenic exposure62. Our analysis reached a different conclusion than an analysis of the 2003–2006 National Health and Nutrition Examination Survey (NHANES), which found no relationship between inorganic arsenic exposure and diagnoses of multiple respiratory diseases63, and a later NHANES analysis that included spirometry64. However, the absolute level of inorganic arsenic exposure in the NHANES population was likely lower than what would be expected in populations with high rice consumption, and these analyses looked at absolute level of arsenic, rather than metabolism efficiency.

Further work is needed to understand why some respiratory phenotypes appear to be more sensitive to arsenic metabolism than others, and whether additional phenotypes, such as control of asthma among asthmatics, may also be affected. The association between past, but not current asthma diagnosis may reflect an increased sensitivity to arsenic toxicity in childhood36,65,66, but additional study is needed to clarify the association. Our observation that the association between arsenic metabolism and pulmonary function differs by smoking history is consistent with other research that has found differential health effects by smoking status36,67. Future studies that may clarify this potential interaction could help to identify those at greatest risk of pulmonary dysfunction.

The mechanisms that underpin this effect are not yet fully understood. In vitro studies have shown arsenic to increase oxidative stress in lung cells68, and also induce epigenetic changes in lung tissue69. Animal studies have observed accumulation of arsenic metabolites in lung tissues70, and markers of immune dysregulation71 and oxidative stress72 in the lungs of chronically exposed mice. In human studies, markers of pulmonary inflammation were elevated in the sputum of exposed individuals73, as was CC16, a marker of early lung damage5. Taken together, this suggests several mechanisms through which arsenic exposure and arsenic metabolism can induce respiratory dysfunction, and additional investigations into these underlying pathways are warranted.

Mendelian randomization assumptions

The ability of our findings to reflect the influence of iAs metabolism depends on assumptions required of all Mendelian randomization and instrumental variable analyses, discussed below. While not fully testable, several lines of evidence suggest that the assumptions are valid.

  1. The SNVs used in the Mendelian randomization are associated with arsenic metabolism.

    Since arsenic metabolites were not measured in HCHS/SOL, the first assumption cannot be directly tested. However, its plausibility is supported first by the multiple study populations in which variation near the AS3MT region has convincingly been associated with iAs metabolism15,17,18,74,75. Additionally, the SNVs selected for the instrument belong to biological pathways that are known to be involved in arsenic metabolism efficiency. rs9527 and rs11191527 are located in the region of the gene AS3MT, and rs61735836 creates a valine to methionine missense mutation in FTCD. Both genes encode enzymes (arsenite methyltransferase and formimidoyltransferase cyclodeaminase) involved in arsenic methylation14,76.

  2. The SNV-pulmonary function associations are not confounded by an unmeasured factor.

    The most common threat to this assumption is through uncontrolled population stratification in which SNVs would spuriously appear to be associated with pulmonary function due to non-genetically-influenced clustering of traits in people of similar genetic backgrounds. However, the methodology used to calculate the SNV-pulmonary function relationships within HCHS/SOL used an extensively validated algorithm which has demonstrated no inflation in type I error for multiple phenotypes within HCHS/SOL77. Our analysis controls for cryptic relatedness, sample clustering, and complex survey design through the mixed effects, as well as ancestral background groups, and principal components through fixed effects23,58.

  3. The SNVs only affect pulmonary function through their effects on arsenic metabolism.

    The ability to test this assumption is limited by our still-expanding knowledge of the effects of the genome. However, its plausibility is supported by the NHGRI-EBI GWAS catalog78, which lists no SNVs associated with any respiratory-related trait in high linkage disequilibrium (r2 > 0.3) with the three SNVs used in the instrument. This suggests that there are no other known mechanisms through which these variants might affect respiratory function except through their influence on arsenic metabolism. While an assessment of the participants of the UK Biobank (UKBB) found a possible association between the AS3MT SNVs in our instrument and smoking status (http://www.phenoscanner.medschl.cam.ac.uk), the SNV-smoking effect sizes reported in the UKBB are small in magnitude and inconsistent in direction, and there is no association between the AS3MT SNVs and smoking in HCHS/SOL (p > 0.25 for both, with ORs close to 1), suggesting that the relationship seen in the UKBB is not responsible for the relationship observed in this analysis. Additionally, our decision to stratify by smoking status alleviates the concern that the variants act through influencing smoking status and downstream pulmonary function.

The results of this study could be strengthened if additional data were available to refine our classification of arsenic exposure in HCHS/SOL beyond the 24-h food recall of grain consumption. However, the public water systems of the HCHS/SOL participants show no evidence of elevated arsenic contamination79,80,81,82, so it is likely that dietary arsenic was a main source of exposure for most participants. While there is potential for misclassification, in that some high consumers of grain may not have eaten much rice, neither non-dietary routes of arsenic exposure nor misreporting of food intake is likely to be confounded with genetics or respiratory dysfunction. Such misclassification would therefore only introduce people with a low level of arsenic exposure into the analytic sample, which would dilute our ability to estimate the effect of metabolism, but not bias the estimates.

As this analysis is one of the first to directly look at the effect of arsenic methylation capacity on pulmonary outcomes in a population with low-to-moderate exposure, we undertook several sensitivity analyses and broadly tested multiple respiratory phenotypes; these multiple tests increase the possibility of type I error. However, the positive associations coherently form a pattern that implicates inefficient arsenic metabolism as a risk factor for a broad range of respiratory outcomes, each of which has its own biological plausibility and passed multiple sensitivity analyses.

In conclusion, those who inefficiently metabolize arsenic show an association with increased risk of measures of pulmonary dysfunction that are used in routine clinical practice, and our analysis suggests that dietary rice may provide enough arsenic exposure to observe this relationship. This suggests that arsenic metabolism may be a previously unappreciated risk factor for pulmonary dysfunction in the Hispanic/Latino community and in other populations in which rice is a dietary staple. These findings suggest that in addition to water-based arsenic exposure, diet should be considered as a possible route by which a participant may be within a level of exposure that can influence respiratory function, while also suggesting that a mitigation strategy aimed at rice could help to reduce the burden of respiratory dysfunction in inefficient arsenic metabolizers.