Introduction

Globally, consumption of nitrate (NO3) and fluoride (F) contaminated groundwater is a serious concern due to their role in causing clinical diseases in humans1,2,3,4,5. Among the different inorganic forms of nitrogen (NO3, NO2 and NH4+) that exist in aquifers, NO3 concentrations are higher than those of NO2 and NH4+ due to their high solubility and mobility rates as well as higher stable oxidative state in water6,7. Both NO2 and NH4+ are easily oxidized and converted to NO3; thus, they have lower contents in groundwater8. Anthropogenic sources that contribute to excess NO3 in groundwater system are overuse of N-fertilizers, excreta from livestock farms, municipal wastewater irrigation, runoff from urban and agricultural land, leaching from waste dumping sites and discharge of untreated sewage and industrial effluents9,10,11,12,13.

The natural sources of NO3 in groundwater are the dissolution and oxidation of nitrogenous minerals in sedimentary and metasedimentary rocks. The bedrock nitrogen minerals, such as nitraline, nitre, suhalite and tobelite, have three possible origins: organic matter, ammonium silicates, and nitrate and ammonium salts14. Dissolution of these sources releases ammonium from their crystal lattices into the soil horizon, a chemical form that can be easily assimilated by soil micro-organisms or converted to NO3 through the nitrification process for nitrogen fixation by leguminous plants15,16,17. In addition, organic nitrogen present in bedrock is mineralized during weathering and converted to ammonium, which is readily used by the soil biota.

Recently, many workers have highlighted the worldwide contamination of groundwater by NO3 and its adverse effects on human health. Examples include the Loess Plateau, Northwest China18, Weining Plain, Northwest China2, Matanza-Riachuelo River Basin, Argentina19, Donsheng district, Inner Mongolia20, Catalan Region, Spain21, Gorveh-Dehgelan, Western Iran22, Shanmuganadhi river basin, southern India23, Jalandhar district, Punjab, India24, Panipat district, Haryana, India25, Nagpur, Western Maharashtra, India26, Gaya district, Bihar, India8 and Tiruppur, Tamil Nadu, India27.

Water with NO3 concentrations between 45 and 100 mg/L and above 100 mg/L is consumed daily for drinking purposes by ~ 118 million and ~ 108 million people, respectively, in India28,29. The most common effect of excess NO3 content (> 45 mg/L) in bottle-fed infants and children is methemoglobinemia24,30. Almasri concludes that the clinical symptoms of methemoglobinemia are normally encountered as body dehydration and gastrointestinal infections31. Further, the biochemical effects of NO3 occur in the human body as follows: (a) NO3 is converted to NO2 under reducing conditions, (b) haemoglobin (Hb) combines with NO2 to form methemoglobin, (c) methemoglobin reduces the oxygen supply in red blood cells and lowers the oxygen level in the body and (d) a higher rate of methemoglobin formation (> 10%) leads to bluish colouration of the skin, known as blue-baby syndrome (cyanosis)4,32. Prolonged exposure to high NO3 content in water carries other health risks, such as multiple sclerosis, nitrosamines and non-Hodgkin lymphoma33,34,35.

Fluoride constitutes approximately 0.32% of the earth's crust and occurs mainly in rocks, such as granites and gneisses. Both natural and anthropogenic inputs contribute to F contamination in groundwater. However, the higher concentrations of F in groundwater are predominantly from geogenic sources, and exposure to them is a threat to human health36. The geogenic sources include the presence of fluoride-bearing minerals, such as fluorite, amphiboles, topaz, apatite, fluorapatite, etc., in rocks, sediments and soils, as well as evapotranspiration and atmospheric deposition37,38,39. Prominent anthropogenic sources are the application of pesticides and phosphatic fertilizers, industrial effluents and landfills40,41,42.

Lower F concentrations (< 0.5 mg/L) in drinking water cause dental caries, whereas concentrations between 0.6 and 1.5 mg/L are essential for bone formation and development of the skeleton and teeth in the human body4,24,43. Long-term exposure to F concentrations above the recommended guideline/permissible limit (1.5 mg/L) may cause dental fluorosis, discoloration, pitting and mottling of teeth, skeletal fluorosis (4–8 mg/L), osteoporosis, arthritis, thyroid disorders, rheumatic pain, kidney problems, muscle stiffness and abnormalities in red blood cells (> 10 mg/L)38,44,45,46,47. Globally, at least 200 million individuals are affected by acute fluorosis in 28 different nations due to the consumption of F-contaminated groundwater48. In India alone, ~ 25 million individuals are affected by endemic fluorosis in 20 states, besides ~ 66 million people at risk of developing fluorosis, including ~ 6 million children below 14 years of age49,50,51,52. Mukherjee and Singh have made a detailed review of F contamination in groundwater in different states of India53.

Supebeda, the study area of this contribution, is a small village situated on the border of Chhattisgarh and Odisha States in a tribal stretch in central India (Fig. 1). Groundwater is the primary source of water in this region. In recent years, the local inhabitants have been facing numerous medical problems related to severe renal issues, kidney diseases and fluorosis. As per media reports, there have been more than 100 casualties to date due to these diseases in recent years and around 300 villagers are battling for life. Thus, the study area has become a hotspot for understanding the real reason for the people’s health problems. Several research organizations, such as the Indian Council of Medical Research, Geological Survey of India, Indira Gandhi Agricultural University, Chhattisgarh State Public Health and Engineering Department, Pandit Ravishankar Shukla University and National Institute of Technology (Raipur), have already visited the village for investigation purposes. The research angles pursued by many of these organizations have been genetic genesis, food habits, consumption of spurious liquor and other medicinal causes54,55,56. Presently, there is no literature available on the geochemistry of groundwater quality and the associated health hazard risks to the local population. Preliminary sampling suggested NO3 and F contamination in groundwater57. The present investigation, therefore, is aimed at making a detailed appraisal of the groundwater quality, non-carcinogenic health risk assessment in humans based on deterministic and probabilistic approaches, hydrochemical characterization, source apportionment of NO3 and F through chemometric techniques and their relationship with other ions in groundwater.

Figure 1

Groundwater sampling around village Supebeda in Chhattisgarh State, India: The village borders the state of Odisha on its east. Groundwater samples were collected from 27 locations marked by black dots. The map was prepared based on MapInfo 8.5 (https://www.precisely.com/product/precisely-mapinfo/mapinfo-pro).

Materials and methods

Study area

The study area, village Supebeda, lies between North latitudes 19° 50′ and 19° 54′ and East longitudes 82° 38′ and 82° 42′, occupying a geographical area of 3 km2 in the administrative block of Deobhog in Gariyaband district of Chhattisgarh State, India (Fig. 1). Situated on the bank of the river Tel, it borders the State of Odisha in the east. With a population of about 1200 people, it has a nearly equal male–female ratio and a literacy rate of 50.51%. The village has a Gram Panchayat (village council). The region has a sub-tropical monsoon climate with three distinct seasons: the southwest monsoon lasts from mid-June to September; the winter season spreads from October to February and the summer season extends from March to mid-June. The average annual rainfall is 1200 mm, and the temperature varies from 5 to 25 °C in winter and from 29 to 46 °C in summer.

Local geology

Gupta et al.58 and Neogi and Das59 have conducted detailed studies on the geology of the area. As per these studies, there are three major lithological units in the area from east to west, i.e., (i) migmatitic quartzofeldspathic gneiss, (ii) banded augen gneiss and (iii) hornblende granite (Fig. 2). Migmatitic quartzofeldspathic gneisses are grey-colored, medium-grained rocks with finely laminated alternations of felsic (quartz + plagioclase + K-feldspar: Qtz + Pl + Kfs) and mafic (Bt + Hbl-rich) bands. Leucocratic segregations are found extensively and are generally stromatically folded into or parallel to the layering. Orthopyroxenes occur occasionally in the migmatite gneisses as greasy, green patches with diffuse margins (‘patchy charnockite’). Bands of migmatized mafic granulites, metapelitic rocks (infrequently sapphirine-bearing) and rare calcsilicate granulites, besides isolated occurrences of blastoporphyritic charnockite, occur congruently with the gneisses.

Figure 2

Geology and hydrogeology of the area around village Supebeda in Chhattisgarh State, India: The area represents a metamorphic terrain with a complex geology58. The arrows mark the groundwater flow in different directions. Wells drilled in the charnockite-khondalite complex are high-yielding with a yield potential of 3–5 L per second. The map was prepared based on MapInfo 8.5 (https://www.precisely.com/product/precisely-mapinfo/mapinfo-pro).

Banded augen gneisses are pink-colored, medium- to coarse-grained rocks. The banding within them is defined by mafic and felsic layers with K-feldspar (Kfs) augen and quartz lenticles. Leucosomes occur occasionally in a narrow zone closer to the migmatitic quartzofeldspathic gneiss unit, with sharp abutment to the west. The gneissic fabric generally precedes the leucosomes in banded augen gneiss. Hbl-rich and Pl + Cpx-rich layers are hosted thinly within banded gneisses. Amphibolites (Hbl + Pl ± Grt ± Cpx) and calc-silicate gneisses with these thin layers are mesoscopic to the regional scale bands.

Intruding into the banded gneiss is the pink-colored, coarse-grained hornblende granite that consists of microcline, quartz, hornblende, and biotite. With intense shearing and mylonitization along its eastern fringe, it has poor presence further westward.

Hydrogeology

Groundwater occurs under unconfined conditions in weathered portions of rocks and under semi-confined to confined conditions in their fractured parts, i.e., in charnockite and khondalite, at depth. The shallow aquifer occurs within an average depth of 16 m. The configuration of the water table in the shallow aquifer follows the topography, due to which the groundwater movement is generally toward valleys or topographic lows. Water bodies, such as tanks, canals, and streams, also influence the occurrence and movement of groundwater in the shallow aquifer. This aquifer is developed mostly by dugwells with depths ranging between 7 and 16 m. In general, the yield of dugwells ranges from 25 to 40 m3/day. The deeper aquifer in the area is formed mainly of granitic rocks and is developed by borewells with a depth range of 50–80 m. In general, the yield of borewells ranges from 85 to 430 m3/day.

The groundwater flow is analyzed based on the water table elevation contours. In the northern part of the study area, groundwater flow is toward the south, i.e., the Tel River, while the flow is toward the north in the southern part. The water table elevations in the study area range between 240 and 260 m above mean sea level, with the northern part having higher water table elevations. Transmissivity ranges from 15 to 45 m2/day in charnockite and khondalite, and at favourable places it goes up to 100 m2/day. The potential fractures for boreholes up to 80 m depth are recorded at various depths, i.e., 40–45, 60–65, 75–80 m, and are 3–4 in number. The hydrogeology of the study area is shown in Fig. 2.

Water sampling and analysis

Groundwater samples from twenty-seven locations were collected from the dugwells and borewells in and around the Supebeda area during the pre-monsoon season (May 2020) (Fig. 1). Plastic bottles (HDPE) of 1000 ml capacity were used. These bottles were prewashed with HNO3 (10%) and rinsed with double deionized water. At the time of sample collection, groundwater sources were flushed for 10–15 min to remove the stagnant water in the pipe and obtain a fresh sample. The sampling bottles were thoroughly rinsed 2–3 times with the fresh groundwater to be collected to preserve the original characteristics of the sampled water. Some basic parameters, such as pH, electrical conductivity (EC) and total dissolved solids (TDS), were measured onsite immediately after the collection of groundwater samples using a pH/EC/TDS meter (Hanna HI 9811-5). Whatman filter paper (0.45 μm) was used to remove the suspended particulate matter. The samples were preserved by acidifying (pH ~ 2 with HNO3) and kept at a temperature of 4 °C. The standard protocol prescribed by the American Public Health Association was followed for the analysis of major cations (Ca2+, Mg2+, Na+, and K+) and anions (HCO3, Cl, SO42−, F, and NO3). Merck-GR grade chemicals and reagents were used to prepare the chemical solutions using double deionized water. All the glassware and apparatus were soaked in 10% hydrochloric acid (HCl) for one day and cleaned with double deionized water. Blank samples were prepared from the stock solutions of each parameter for instrumental calibration. The accuracy of the analytical data was checked using the charge balance error (CBE) equation (Eq. 1), and each sample was within the error limit of ± 5%60.

$${\text{CBE}}\% = \frac{{\sum \left( {Cations} \right)meq/L - \sum \left( {Anions} \right)meq/L}}{{\sum \left( {Cations} \right)meq/L + \sum \left( {Anions} \right)meq/L}} \times 100$$
(1)
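As an illustration of Eq. (1), the following minimal Python sketch computes the CBE for a single sample. The function name and the ion concentrations are hypothetical placeholders chosen for illustration; they are not values from this study.

```python
def charge_balance_error(cations_meq, anions_meq):
    """Return the charge balance error (%) following Eq. (1).

    Both arguments are iterables of concentrations in meq/L.
    """
    total_cations = sum(cations_meq)
    total_anions = sum(anions_meq)
    return (total_cations - total_anions) / (total_cations + total_anions) * 100.0


# Hypothetical sample in meq/L (illustrative only, not study data)
cations = [3.2, 1.7, 2.1, 0.1]        # Ca2+, Mg2+, Na+, K+
anions = [4.5, 1.6, 0.5, 0.05, 0.3]   # HCO3-, Cl-, SO4 2-, F-, NO3-

cbe = charge_balance_error(cations, anions)
acceptable = abs(cbe) <= 5.0          # the +/- 5% criterion stated in the text
print(f"CBE = {cbe:.2f}%, acceptable: {acceptable}")
```

A sample failing the ± 5% check would normally be re-analysed before being used in further interpretation.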

Human health risk assessment (HHRA)

Human health risk assessment (HHRA) is the quantitative risk analysis of potentially harmful chemical parameters present in water on human health through various pathways and specific time periods61,62. It has four distinct steps: (1) hazard identification, (2) exposure assessment, (3) dose–response assessment and (4) risk characterization4,63.

The significant pathways for risk analysis on human health from chemical exposure are ingestion and dermal contact. In the present study, the average daily dose (ADD) of ingestion and dermal pathways for target chemicals, namely NO3 and F, are employed to determine the non-carcinogenic HHRA as shown in Eqs. (2) and (3)64. The assessments of ADDingestion and ADDdermal are computed on four different age groups, i.e., infants (< 1 year), children (1–11 years), teens (11–18 years) and adults (above 18 years). The adverse impact of the target parameters on human health may vary due to physiological and behavioural attributes, organ development factors and tolerance responses to the specific chemicals in the human body.

$$ADD_{ingestion} = \frac{{C_{M} \times IR_{w} \times EF_{r} \times ED}}{{BW \times AT_{r} }}$$
(2)
$$ADD_{dermal} = \frac{{C_{M} \times SA \times K_{p} \times EF_{r} \times ED \times ET \times CF}}{{BW \times AT_{r} }}$$
(3)

(The parameters/variables used in these equations are defined in Supplementary Table S1).
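Eqs. (2) and (3) can be sketched in Python as below. The input values are hypothetical placeholders, not the values of Supplementary Table S1, and the units noted in the comments (including the conversion factor CF in L/cm3) are assumptions that follow common USEPA-style usage.

```python
def add_ingestion(c_m, ir_w, ef_r, ed, bw, at_r):
    """Average daily dose via ingestion, Eq. (2), in mg/kg per day."""
    return (c_m * ir_w * ef_r * ed) / (bw * at_r)


def add_dermal(c_m, sa, k_p, ef_r, ed, et, cf, bw, at_r):
    """Average daily dose via dermal contact, Eq. (3), in mg/kg per day."""
    return (c_m * sa * k_p * ef_r * ed * et * cf) / (bw * at_r)


# Hypothetical adult receptor (all values are illustrative assumptions)
c_m = 45.0        # concentration in water, mg/L
ir_w = 2.5        # ingestion rate, L/day
ef_r = 365.0      # exposure frequency, days/year
ed = 30.0         # exposure duration, years
bw = 60.0         # body weight, kg
at_r = ef_r * ed  # averaging time, days
sa = 18000.0      # exposed skin surface area, cm2
k_p = 0.001       # skin permeability coefficient, cm/h
et = 0.5          # exposure time, h/day
cf = 0.001        # unit conversion factor, L/cm3

print(add_ingestion(c_m, ir_w, ef_r, ed, bw, at_r))
print(add_dermal(c_m, sa, k_p, ef_r, ed, et, cf, bw, at_r))
```

With these placeholder inputs the dermal dose is several orders of magnitude smaller than the ingestion dose, which is the typical pattern for dissolved ions.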

The ratio of the potential adverse non-carcinogenic risk from each exposure pathway (ingestion and dermal) with respect to the corresponding reference dose of a chemical parameter is estimated through hazard quotient (HQ)64, as shown in Eqs. (4)–(5). Hazard index (HI) is the combined non-carcinogenic hazard risks of a particular parameter from all different possible exposure routes65,66. Both HI and HQ are unitless values. When HQ > 1, it is indicative of potential health effects from a specific exposure route67. Similarly, the value of HI > 1 depicts the adverse non-carcinogenic toxicity in each target age group67.

$$HQ_{ingestion } = \frac{{ADD_{ingestion} }}{{RfD_{i} }}$$
(4)
$$HQ_{dermal } = \frac{{ADD_{dermal} }}{{RfD_{d} }}$$
(5)
$$HI_{M } = \mathop \sum \limits_{i = 1}^{n} HQ_{ingestion} + \mathop \sum \limits_{i = 1}^{n} HQ_{dermal}$$
(6)

where \(HI_{M }\) indicates the total hazard index of a specific parameter, and i represents the exposure route of a specific subpopulation group considered in the present study (Eq. 6). The ingestion and dermal pathways reference dose (RfD) values for NO3 and F are 1.6 mg/kg per day and 0.06 mg/kg per day, respectively4,68,69,70.
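A minimal sketch of Eqs. (4)–(6) for one parameter and one receptor follows. The RfD values are those quoted above; the ADD inputs are hypothetical and serve only to show the arithmetic.

```python
# Reference doses quoted in the text (mg/kg per day), used for both pathways
RFD = {"NO3": 1.6, "F": 0.06}


def hazard_quotient(add, rfd):
    """Hazard quotient for one pathway, Eqs. (4) and (5)."""
    return add / rfd


def hazard_index(add_ing, add_derm, rfd_ing, rfd_derm):
    """Hazard index as the sum of ingestion and dermal HQs, Eq. (6)."""
    return hazard_quotient(add_ing, rfd_ing) + hazard_quotient(add_derm, rfd_derm)


# Hypothetical average daily doses for NO3 (mg/kg per day), illustration only
add_ing_no3 = 1.875
add_derm_no3 = 0.00675

hq_ing = hazard_quotient(add_ing_no3, RFD["NO3"])
hq_derm = hazard_quotient(add_derm_no3, RFD["NO3"])
hi_no3 = hazard_index(add_ing_no3, add_derm_no3, RFD["NO3"], RFD["NO3"])

print(f"HQ_ingestion = {hq_ing:.4f}, HQ_dermal = {hq_derm:.4f}, HI = {hi_no3:.4f}")
print("Potential non-carcinogenic risk" if hi_no3 > 1 else "Below threshold")
```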

Both deterministic and probabilistic approaches are applied to determine the potential non-carcinogenic HHRA in the present study. The deterministic approach simply inserts fixed values into the mathematical formulae developed by USEPA for different exposure pathways71. This point estimation generates only a single value, which may underestimate or overestimate the risk. In reality, the input variables vary with climatic conditions, place, time, chemical concentrations in water and receptor types (i.e., body weight, exposure frequency and different subpopulation groups)72. Because the deterministic model assigns only one fixed value to every input variable and so ignores this variability, it is regarded as a conservative risk assessment approach.

The probabilistic technique, namely Monte Carlo Simulation (MCS), is an alternative statistical model suggested by USEPA that offers a sound methodology and provides holistic information for risk assessment72. Monte Carlo Simulation repeatedly samples the input variables from their statistical distributions and evaluates the risk equation for each draw, which yields a probabilistic approximation of the output, gives more reproducible results and reduces the uncertainty associated with risk analysis4. Oracle Crystal Ball software (version 11.1.2.4.850) is used for the MCS study. The operation of MCS requires the input variables/parameters to be characterized by their maximum, minimum, mean, and standard deviation (SD) values so that the best-fitted statistical distribution types can be defined and their probability distribution functions (PDFs) generated72. The input parameters, such as ingestion rate (IRw), exposure frequency (EF), exposure duration (ED), exposed skin surface area (SA), exposure time (ET) and body weight (BW), are sampled over 10,000 iterations for the computation of risks from oral ingestion and dermal contact for each subpopulation group. Numerical stability of MCS is obtained at 10,000 iterations for HQ and HI4,73,74. Sensitivity analysis is also employed to identify the input variables that most strongly affect the outcome of the simulation model for potential risks.
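The general idea of a Monte Carlo risk simulation can be sketched with the Python standard library as follows. This is not the Crystal Ball workflow used in the study: the distribution types and their parameters are hypothetical assumptions, and only the ingestion pathway for NO3 is simulated.

```python
import random
import statistics


def simulate_hq_ingestion(n_iter=10_000, seed=42, rfd=1.6):
    """Draw n_iter HQ_ingestion values from assumed input distributions."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hq_values = []
    for _ in range(n_iter):
        # All distributions below are illustrative assumptions
        c_m = rng.lognormvariate(3.4, 0.8)         # concentration, mg/L
        ir_w = max(rng.gauss(2.5, 0.5), 0.1)       # ingestion rate, L/day
        bw = max(rng.gauss(60.0, 10.0), 20.0)      # body weight, kg
        ef = rng.triangular(180.0, 365.0, 350.0)   # days/year (low, high, mode)
        ed = 30.0                                  # exposure duration, years
        at = 365.0 * ed                            # averaging time, days
        add = (c_m * ir_w * ef * ed) / (bw * at)   # Eq. (2)
        hq_values.append(add / rfd)                # Eq. (4)
    return hq_values


hq = simulate_hq_ingestion()
percentiles = statistics.quantiles(hq, n=100)      # 99 cut points
p5, p95 = percentiles[4], percentiles[94]
share_above_one = sum(v > 1.0 for v in hq) / len(hq)

print(f"mean = {statistics.mean(hq):.3f}, median = {statistics.median(hq):.3f}")
print(f"5th = {p5:.3f}, 95th = {p95:.3f}, P(HQ > 1) = {share_above_one:.2%}")
```

The output is a distribution of HQ values, from which the mean, median and percentiles can be reported in the same form as the probabilistic columns of a risk table.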

In this work, the concentration values of the target parameters, i.e., NO3 and F, are fitted using the auto-select option to determine their best-fitted probability distribution patterns. Their goodness of fit (GoF) statistical outcomes are presented in Table 1. The values and types of distribution of various input variables for ingestion and dermal pathways for the deterministic and probabilistic models are provided in Supplementary Table S1.

Table 1 Best fitted and goodness of fit (GoF) outcomes of the probability distribution of Nitrate and Fluoride in the groundwater around village Supebeda in Chhattisgarh State, India.

Chemometric analysis

Chemometric statistical models, such as principal component analysis (PCA) and cluster analysis (CA), are widely used by many researchers to distinguish among the probable sources of chemical parameters in water11,75,76,77,78. Principal component analysis enables extraction of valuable information and better interpretation of statistically significant parameters from large, complex datasets79. The present study uses z-scale standardization of all chemical parameters to generate dimensionless values80,81,82. Varimax rotation method has been employed to extract the principal components (PCs). The PCs with eigenvalues > 1 are statistically significant for interpreting the hidden factors in water quality83,84.
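The z-scale standardization step can be sketched as below. The example concentrations are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which form was applied.

```python
import statistics


def z_standardize(values):
    """Return dimensionless z-scores: (x - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample SD (n - 1 denominator)
    return [(x - mean) / sd for x in values]


# Hypothetical concentrations of two parameters in five samples (mg/L)
nitrate = [12.0, 48.0, 95.0, 30.0, 110.0]
fluoride = [0.3, 0.9, 1.6, 0.7, 1.9]

z_nitrate = z_standardize(nitrate)
z_fluoride = z_standardize(fluoride)

# After standardization both variables have mean 0 and SD 1, so parameters
# measured on very different scales contribute equally to PCA or CA.
print([round(z, 3) for z in z_nitrate])
print([round(z, 3) for z in z_fluoride])
```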

Cluster analysis has been used to group similar objects or variables from a heterogeneous set85. Ward’s linkage and squared Euclidean distance have been applied to the z-transformed data to obtain different clusters86. The cluster significance has been assessed using Sneath’s test method87. Minitab 17 and MS Office 2021 have been employed to perform the statistical analysis.

Results and discussions

Table S2 lists the concentrations of various physicochemical parameters in the analyzed groundwater samples. Table 2 gives the statistical description of physicochemical parameters [range, mean, and standard deviation (SD)] and the percentage of samples above the BIS46 and WHO45 standards. Water samples are neutral to slightly alkaline in nature, with pH values ranging from 7.2 to 8.3 (mean ± SD = 7.9 ± 0.3). EC values show wide variation from 313.0 to 3446.0 µS/cm, with 11.11% of samples above the guideline value of 1500 µS/cm45. High EC values at some locations indicate salinity caused by excessive mineralization of groundwater. The water quality classification based on EC values88 indicates that 62.96% of the samples are moderately saline, 26.63% are medium to highly saline and 7.41% are highly saline for irrigation purposes (Table S3). Further, classification by FAO89 shows that 7.41% of samples are above the standard EC range (0–3000 µS/cm) for irrigational use (Table 2).

Table 2 Descriptive statistics of chemical parameters of groundwater samples collected from Supebeda, district Gariyaband, Chhattisgarh, India.

Groundwater samples with TDS values above the acceptable limit of 500 mg/L and the permissible limit of 2000 mg/L46 for drinking purposes account for 22.22% and 11.11%, respectively, of all collected samples. These 11.11% of the samples are above the normal range of 0–2000 mg/L89 for irrigational use as well. As per Davis and DeWiest90, about 63% of the samples are within the desirable limit for drinking purposes and about 26% lie between the desirable limit and the permissible limit of 1000 mg/L (Table S3). Further, the TDS classification by Freeze and Cherry91 indicates that the majority of the groundwater samples (88.89%) fall under the freshwater category and the remaining 11.11% under the brackish water category (Table S3).

Total hardness (TH) values vary from 65.0 to 755.0 mg/L, with 33.33% and 11.11% of samples above the acceptable (200 mg/L) and permissible limits (600 mg/L), respectively46. The elevated level of TH is primarily linked with the excess concentrations of Ca2+, Mg2+ and HCO3 ions in groundwater11,76. Classification of groundwater based on TH values by Sawyer and McCarty92 reveals that 3.70% of the samples are soft, 22.22% are moderately hard, 44.44% are hard and 29.63% are very hard in nature (Table S3). Further, the TDS versus TH plot shows that the groundwater is fresh to brackish and moderately hard to very hard in nature (Fig. S1). Sindhu concludes that the prolonged consumption of very hard water is associated with calcification of arteries, urolithiasis, anencephaly, and gastrointestinal tract irritation93. The Box-Whisker plot shows the relative abundance and dominance of various cations (Ca2+ > Na+ > Mg2+ > K+) and anions (HCO3 > Cl > NO3 > SO42− > F) in groundwater (Fig. S2).

Major parameters

About 26.9% and 3.7% of samples show Ca2+ contents above the acceptable limit of 75 mg/L and permissible limit of 200 mg/L, respectively46. The Mg2+ mean ± SD is 21 ± 13.2 mg/L, with 11.11% of samples above the acceptable limit of 30 mg/L46 (Table 2). The alkali metals, i.e., Na+ and K+, are within their respective guideline values (200 mg/L and 12 mg/L)45. HCO3 concentrations range from 85 to 519 mg/L, with 3.7% of samples above the guideline value of 500 mg/L45. Chloride (Cl) concentrations vary from 7.1 to 408.3 mg/L, with 11.11% of samples above the acceptable limit of 250 mg/L46. The excess concentrations of Ca2+, Mg2+, HCO3 and Cl ions are the key contributors to the hardness of groundwater24. The level of SO42− ions in groundwater is within the acceptable limit of 200 mg/L46. The concentrations of cations (Ca2+, Mg2+, and Na+) and anions (HCO3, Cl, and SO42−) are within their normal ranges for irrigational use (Table 2)89.

Health risk parameters

Consumption of NO3 and F-rich water causes various health risks in humans. The NO3 content in groundwater in Supebeda exceeds its guideline value of 45 mg/L for drinking and irrigation purposes46,89 in 37.0% of samples (Table 2). The classification of NO3 concentrations based on Adimalla43 signifies that 62.96% of the samples have no risk (< 45 mg/L), 29.63% of samples have high risk (45–100 mg/L) and the remaining 7.41% of samples have very high risk to human health (> 100 mg/L) (Table S4).
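The three NO3 risk classes quoted above can be applied to a set of samples as in the following sketch. The class boundaries are those given in the text; the handling of values exactly at 45 and 100 mg/L and the sample concentrations are assumptions made for illustration.

```python
from collections import Counter


def classify_nitrate(no3_mg_per_l):
    """Assign a NO3 concentration (mg/L) to one of the three risk classes."""
    if no3_mg_per_l < 45.0:
        return "no risk"
    if no3_mg_per_l <= 100.0:   # assumption: 45 and 100 mg/L fall in this class
        return "high risk"
    return "very high risk"


# Hypothetical NO3 concentrations (mg/L), not the study data
samples = [12.0, 30.5, 44.9, 45.0, 67.2, 100.0, 118.4, 8.3]

counts = Counter(classify_nitrate(c) for c in samples)
for label in ("no risk", "high risk", "very high risk"):
    share = 100.0 * counts[label] / len(samples)
    print(f"{label}: {counts[label]} samples ({share:.2f}%)")
```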

F concentrations range from 0 to 1.9 mg/L, with 14.8% and 25.9% of samples above the acceptable limit of 1.0 mg/L and permissible limit of 1.5 mg/L, respectively46. Although about 55.56% of samples have the F level required for human health (0.6–1.5 mg/L), as per Adimalla43, 22.22% of samples may cause dental caries (< 0.5 mg/L) and an equal percentage may cause dental fluorosis (1.6–2.0 mg/L) (Table S4).

Health risk assessment (HRA)

Table S5 provides the calculated average daily dose (ADD) values of NO3 and F through ingestion and dermal contact of groundwater using deterministic and probabilistic approaches for different age groups. Tables 3 and 4 show estimates of the non-carcinogenic HRA with respect to the hazard quotient (\(HQ\)) and hazard index (\(HI\)) parameters, respectively. The deterministically calculated mean, median, 5th percentile (minimum) and 95th percentile (maximum) values of \(HQ_{ingestion}\), \(HQ_{dermal}\) and \(HI\) for NO3 and F are higher than the probabilistically estimated values in all target population groups. The mean and 95th percentile of \(HQ_{{{\text{NO}}_{3}^{ - } }}\) for the ingestion pathway are above the acceptable limit (i.e., \(HQ\) > 1) in the deterministic study, which indicates that the potential non-carcinogenic risk may affect large sections of all target populations. On the other hand, the probabilistically calculated \(HQ_{{{\text{NO}}_{3}^{ - } }}\) for the ingestion pathway is above the threshold limit (i.e., \(HQ\) > 1) only at the 95th percentile, which shows that the clinical risk of non-carcinogenic effects is a concern only for the sensitive sections of people in all subpopulation groups at the extreme point (Table 3).

Table 3 Statistical description of deterministically and probabilistically calculated hazard quotient (HQ) for ingestion and dermal pathways in different age groups.
Table 4 Statistical description of deterministically and probabilistically calculated hazard index (HI) for ingestion and dermal pathways in different age groups.

In the deterministic estimate, \(HQ_{{{\text{F}}^{ - } }}\) for the ingestion route shows mean and 95th percentile values above the safety limit (\(HQ\) > 1) only in infants, while the rest of the population groups (children, teens, and adults) have \(HQ_{{{\text{F}}^{ - } }}\) > 1 at the 95th percentile only. In the probabilistic study, the threat of non-carcinogenic hazard appears only at the maximum point (\(HQ_{{{\text{F}}^{ - } }}\) 95th percentile > 1) through the ingestion pathway in the infants and children's groups (Table 3). For dermal contact, the deterministically and probabilistically calculated mean, median, 5th percentile and 95th percentile values of \(HQ_{{{\text{NO}}_{3}^{ - } }}\) and \(HQ_{{{\text{F}}^{ - } }}\) are below the threshold limit (\(HQ\) < 1) in all target population groups. This indicates that there is no potential non-carcinogenic health risk through dermal contact from the indicator parameters (Table 3).

Risk certainty level (RCL) is assessed as the percentage likelihood that the non-cancer hazard quotient exceeds the threshold value (\(HQ\) > 1) across all individual values of a particular pathway. It is always advantageous to determine the RCL value in HRA for any exposure pathway, even if the mean, 5th percentile and 95th percentile values of different age groups are below their threshold limits. Among the target age groups, the order of deterministic RCL (\(HQ\) > 1) for NO3 and F through the ingestion route is infants (\(HQ_{{{\text{NO}}_{3}^{ - } }}\) = 51.85% and \(HQ_{{{\text{F}}^{ - } }}\) = 66.67%) > children (\(HQ_{{{\text{NO}}_{3}^{ - } }}\) = 48.15% and \(HQ_{{{\text{F}}^{ - } }}\) = 33.33%) > adults (\(HQ_{{{\text{NO}}_{3}^{ - } }}\) = 40.74% and \(HQ_{{{\text{F}}^{ - } }}\) = 33.33%) > teens (\(HQ_{{{\text{NO}}_{3}^{ - } }}\) = 37.04% and \(HQ_{{{\text{F}}^{ - } }}\) = 25.93%) (Table 3). Similar orders of NO3 and F non-carcinogenic health risk for the groundwater ingestion pathway are reported from Jiangcungou, Northwest China (i.e., children > adults > teenagers)1 and Nalagarh valley, Himachal Pradesh, India (i.e., infants > children > adults > teenagers)4. On the other hand, the probabilistic RCL (\(HQ\) > 1) order for NO3 and F through the ingestion pathway is infants (\(HQ_{{{\text{NO}}_{3}^{ - } }}\) = 34.02% and \(HQ_{{{\text{F}}^{ - } }}\) = 24.17%) > children (\(HQ_{{{\text{NO}}_{3}^{ - } }}\) = 23.00% and \(HQ_{{{\text{F}}^{ - } }}\) = 10.55%) > teens (\(HQ_{{{\text{NO}}_{3}^{ - } }}\) = 13.16% and \(HQ_{{{\text{F}}^{ - } }}\) = 2.00%) > adults (\(HQ_{{{\text{NO}}_{3}^{ - } }}\) = 11.62% and \(HQ_{{{\text{F}}^{ - } }}\) = 1.25%) (Table 3).
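Read as the share of values exceeding the threshold, the RCL can be computed as in this minimal sketch. The function name and the HQ values are hypothetical; the example assumes that a deterministic calculation yields one HQ value per groundwater sample, in which case 14 exceedances among 27 values would correspond to an RCL of 51.85%.

```python
def risk_certainty_level(hq_values, threshold=1.0):
    """Return the percentage of HQ (or HI) values above the threshold."""
    exceedances = sum(1 for hq in hq_values if hq > threshold)
    return 100.0 * exceedances / len(hq_values)


# Hypothetical set of 27 HQ values: 14 above 1 and 13 below (illustration only)
hq_values = [1.2] * 14 + [0.5] * 13

rcl = risk_certainty_level(hq_values)
print(f"RCL (HQ > 1) = {rcl:.2f}%")
```

The same function applies unchanged to the 10,000 simulated values of a probabilistic run, where the RCL is no longer restricted to multiples of 1/27.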

The deterministic and probabilistic RCLs (\(HQ\) > 1) indicate trivial non-carcinogenic risks from the indicator parameters (NO3 and F) through the dermal route. Table 3 therefore shows that NO3 and F exposure through direct groundwater consumption has a higher non-carcinogenic \(HQ\), by several orders of magnitude, than exposure through the dermal route in all age groups. Similar findings of non-cancerous health risks are reported by Liu72 for the groundwater of the Weining Plain, China. Further, among the indicator parameters, the mean, median and 95th percentile values of \(HQ_{{{\text{NO}}_{3}^{ - } }}\) are higher than those of \(HQ_{{{\text{F}}^{ - } }}\) through the groundwater ingestion pathway within each stratified age group in both deterministic and probabilistic approaches (Table 3).

Hazard index (HI)

The non-carcinogenic \(HI\) is the combination of the non-carcinogenic hazard quotients of each indicator parameter (NO3 or F) through multiple exposure pathways (ingestion and dermal) of groundwater, as shown in Table 4. The mean, median and 95th percentile values for infants and children in the deterministic result exceed the safety reference level (\(HI_{{{\text{NO}}_{3}^{ - } }}\) > 1), revealing a prominent non-carcinogenic health risk from NO3 in these age groups. The remaining population groups (teens and adults) in the deterministic study and all the target population groups in the probabilistic estimate show a non-carcinogenic risk from NO3 at the 95th percentile only (\(HI_{{{\text{NO}}_{3}^{ - } }}\) > 1).

With respect to F, in the deterministic study, the potential non-cancerous effect is prominent in infants since the mean, median and 95th percentile values are above the safe reference limit (i.e., \(HI_{{{\text{F}}^{ - } }}\) > 1), but the rest of the subpopulation groups show \(HI_{{{\text{F}}^{ - } }}\) > 1 in 95th percentile only, which shows that the threat of health risk is still persistent in the sensitive sections of the stratified age groups at the extreme value. On the other hand, in the probabilistic estimate, the \(HI_{{{\text{F}}^{ - } }}\) results indicate that the infants and children’s groups are at the risk of non-carcinogenic effect at 95th percentile values, i.e., \(HI_{{{\text{F}}^{ - } }}\) > 1.

Accordingly, in the deterministic output, the RCL magnitude of non-carcinogenic \(HI_{{{\text{NO}}_{3}^{ - } }}\) risk stands at infants (51.85%) > children (48.15%) > teens (40.74%) = adults (40.74%), and that of \(HI_{{{\text{F}}^{ - } }}\) at infants (66.67%) > children (37.04%) > adults (33.33%) > teens (25.23%) (Table 4). The probabilistically calculated RCL health risks in the subpopulation groups are in the following order: infants (\(HI_{{{\text{NO}}_{3}^{ - } }}\) = 34.03% and \(HI_{{{\text{F}}^{ - } }}\) = 24.17%) > children (\(HI_{{{\text{NO}}_{3}^{ - } }}\) = 23.01% and \(HI_{{{\text{F}}^{ - } }}\) = 10.56%) > teens (\(HI_{{{\text{NO}}_{3}^{ - } }}\) = 13.17% and \(HI_{{{\text{F}}^{ - } }}\) = 2.00%) > adults (\(HI_{{{\text{NO}}_{3}^{ - } }}\) = 11.62% and \(HI_{{{\text{F}}^{ - } }}\) = 1.25%).

The deterministic RCL for \(HI\) is higher than the probabilistic RCL in all age groups because the deterministic estimation relies on extreme (single point) values (please see Table S1, fifth column) for each input variable at different concentration levels of the indicator parameters. Since these extreme (single point) values may not always represent the actual field conditions, the deterministic estimates often overestimate the output results (Table 4). Therefore, the deterministic approach cannot capture the full risk scenario for all members of the population of interest, owing to person-to-person differences and the dynamic nature of the environment.

The probabilistic approach gives a range of values to choose from depending on the most likely field conditions (please see Table S1, eighth column). The probabilistic approximation of events therefore reduces the uncertainties and provides more accurate and prospective risk assessment outcomes than the conventional deterministic approach. Liu72 also conclude that health risk assessment of groundwater through probabilistic simulation provides more comprehensive results.
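The contrast between a point estimate and a simulated distribution can be sketched as follows. The example covers the ingestion pathway only and uses the widely used USEPA-style form HQ = (CM × IR × EF × ED)/(BW × AT × RfD), with AT = ED × 365 days for non-carcinogenic effects. Every distribution, the RfD default of 1.6 mg/kg/day and all function names are assumptions chosen for illustration; the distributions actually used in the study are those of Table S1.

```python
import random


def hq_ingestion(cm, ir, ef, ed, bw, rfd):
    """Ingestion hazard quotient: chronic daily intake divided by the reference dose."""
    at = ed * 365.0                          # averaging time in days (non-carcinogenic)
    cdi = (cm * ir * ef * ed) / (bw * at)    # chronic daily intake, mg/kg/day
    return cdi / rfd


def simulate_hq(n_trials=10000, rfd=1.6, seed=1):
    """Draw every input from an ASSUMED distribution and return sorted HQ values."""
    rng = random.Random(seed)
    hq = []
    for _ in range(n_trials):
        cm = rng.lognormvariate(3.5, 0.6)    # concentration, mg/L (hypothetical)
        ir = rng.uniform(1.5, 3.0)           # ingestion rate, L/day (hypothetical)
        ef = rng.triangular(180, 365, 350)   # exposure frequency, days/year (low, high, mode)
        ed = rng.uniform(20, 40)             # exposure duration, years (hypothetical)
        bw = max(rng.gauss(60, 8), 30.0)     # body weight, kg, truncated from below
        hq.append(hq_ingestion(cm, ir, ef, ed, bw, rfd))
    hq.sort()
    return hq


def percentile(sorted_values, q):
    """Nearest-rank percentile of an already sorted list (q between 0 and 100)."""
    idx = min(len(sorted_values) - 1, int(q / 100.0 * len(sorted_values)))
    return sorted_values[idx]


hq = simulate_hq()
share_above_one = 100.0 * sum(v > 1 for v in hq) / len(hq)
print(f"mean={sum(hq) / len(hq):.2f} median={percentile(hq, 50):.2f} "
      f"p95={percentile(hq, 95):.2f} share HQ>1={share_above_one:.1f}%")
```

A deterministic run corresponds to a single call of `hq_ingestion` with fixed inputs, whereas the simulation returns a full distribution from which the mean, median and 95th percentile can be read.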

The present study, however, suggests that the HRA of the indicator parameters should be studied using both deterministic and probabilistic approaches mutually to obtain more holistic outputs, thereby reducing the uncertainties and overcoming the conservative risk analysis of the point estimation. In a similar line, Kaur25 conclude that the deterministic and probabilistic methods may be studied independently to assess non-carcinogenic HHRA (NO3 and F) in groundwater.

Sensitivity and uncertainty analysis

The deterministic technique makes no provision for sensitivity and uncertainty analysis. Therefore, sensitivity analysis has been carried out within the probabilistic approach using Monte Carlo Simulation (MCS) to extract the most influential input variables for the non-carcinogenic risk prediction. Figure 3a,b presents the tornado plots showing the percentage contributions of all input variables to non-carcinogenic \(HI_{{{\text{NO}}_{3}^{ - } }}\) and \(HI_{{{\text{F}}^{ - } }}\) in the stratified age groups. The sensitivity analysis shows that the dermal-route variables contribute little to the overall non-carcinogenic \(HI\) in all subpopulation groups and that the input variables of the ingestion pathway have greater potential non-carcinogenic health effects than those of dermal contact. The \(HQ\) results are thus supported by the tornado plots. For the \(HI_{{{\text{NO}}_{3}^{ - } }}\) sensitivity output, the parameter concentration (CMingestion) is the most influential variable, followed by exposure duration (EDingestion), with minor contributions from ingestion rate (IRingestion) and exposure frequency (EFingestion) in all target populations. This indicates that higher NO3 content in ingested water will have greater health implications, but as per Carlsson94, 60–70% of the ingested NO3 dose is generally excreted in urine within the first 23 h. Therefore, clinical NO3 toxicity in humans is possibly less significant because the ingested NO3 dose stays in the body for a limited time.

Figure 3

Tornado plots illustrating sensitivity analysis of input variables to the non-carcinogenic hazard index (HI) of groundwater: (a) NO3 and F ingestion and (b) for dermal contact: This sensitivity analysis figure is drawn by the probabilistic approach using the Monte Carlo Simulation (MCS) technique to extract the most influential input variables for the non-carcinogenic risk prediction for human health. The length of horizontal bars indicates the percentage contribution of various input variables to extract the non-carcinogenic hazard index (HI) of different age groups.

The results of the \(HI_{{{\text{F}}^{ - } }}\) sensitivity analysis for infants and children stand in the order of EDingestion > CMingestion > IRingestion > EFingestion. In infants and children, 80% of the oral F intake is absorbed in the body and stored in the bones and teeth95. Thus, exposure duration is the most significant input variable due to the high retention of the F intake dose in infants and children. The tornado \(HI_{{{\text{F}}^{ - } }}\) plots for teens and adults show the contributing variables in the order CMingestion > EDingestion > IRingestion > EFingestion. For teens and adults, ~ 50% of orally ingested F is retained in the body95,96. Thus, the lower retention potential of the F dose in teens and adults compared to that in infants and children indicates that the parameter concentration is the main driving force for fluoride toxicity in the sensitivity outputs. The body weight (BWingestion) variable negatively influences the non-carcinogenic \(HI_{{{\text{NO}}_{3}^{ - } }}\) and \(HI_{{{\text{F}}^{ - } }}\) simulations in all age groups (Fig. 3a,b).
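One common way to obtain tornado-plot percentages of this kind is to rank-correlate each simulated input with the simulated output and normalise the squared correlations to 100%, keeping the sign of the correlation. The sketch below follows that approach; it is an assumption about the general technique, not a description of the software used in the study, and the distributions and names (`contribution_to_variance`, `CM`, `IR`, `BW`) are hypothetical.

```python
import random


def ranks(values):
    """1-based rank of each value; ties broken by order (fine for continuous draws)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def contribution_to_variance(inputs, output):
    """Signed share (%) of each input, from normalised squared rank correlations."""
    rho = {name: spearman(values, output) for name, values in inputs.items()}
    total = sum(r * r for r in rho.values())
    return {name: (100.0 * r * r / total) * (1 if r >= 0 else -1)
            for name, r in rho.items()}


# Hypothetical simulated inputs (not study data)
rng = random.Random(7)
n = 2000
inputs = {
    "CM": [rng.lognormvariate(3.5, 0.6) for _ in range(n)],
    "IR": [rng.uniform(1.5, 3.0) for _ in range(n)],
    "BW": [rng.gauss(60, 8) for _ in range(n)],
}
output = [c * i / b for c, i, b in zip(inputs["CM"], inputs["IR"], inputs["BW"])]
shares = contribution_to_variance(inputs, output)
for name, share in sorted(shares.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {share:+.1f}%")
```

In this toy setup the body weight receives a negative share because it sits in the denominator of the intake expression, which mirrors the negative bar reported for BWingestion.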

Uncertainty analysis is crucial in determining the conservatism, ramifications and level of certainty of the risk analysis results97. In this study, MCS is applied to identify and quantify the uncertainties in the non-cancer HRA. Nevertheless, other uncertainties in the model input variables remain unaccounted for, which limits the validity of the overall scenario. For example, (i) the daily water intake and dermal contact of target population groups are not measured during the groundwater sampling, (ii) body weights of the local population are not evaluated (instead, the representative data of the Indian Council of Medical Research (ICMR) and USEPA are used), (iii) averaging time, dermal permeability and conversion factor are taken as the same, fixed or similar values in the deterministic and probabilistic approaches for different age groups, (iv) the variable data used to generate the probability distribution functions (PDFs) for MCS are acquired from the USEPA and other relevant published literature, (v) the assumption that the concentrations of specific chemical parameters in groundwater are fully bio-absorbed in the human body may lead to ambiguity in the risk analysis, and finally (vi) the reference doses (RfD) for ingestion and dermal exposures are obtained from USEPA.

Hydrogeochemical processes

A Gibbs diagram is applied to elucidate the mechanism controlling groundwater chemistry in the study area98. This diagram plots the cation ratio [Na+/(Na+ + Ca2+)] or anion ratio [Cl/(Cl + HCO3)] against TDS and thereby defines three distinct areas, namely evaporation, rock-water and precipitation zones. Figure 4 shows that the majority of the groundwater samples (88.89%) cluster in the rock dominance zone and the remaining samples (11.11%) fall in the evaporation zone.

Figure 4

Gibbs diagram representing the factors controlling groundwater chemistry: This diagram enables to understand the relationship between cation ratio [Na+/(Na+ + Ca2+)] or anion ratio [Cl/(Cl + HCO3)] versus TDS defining three distinct areas, namely evaporation, rock-water and precipitation zones to elucidate the dominant mechanism influencing the groundwater chemistry of the study area.

Groundwater chemistry is primarily influenced by various geochemical processes, especially the interaction of percolating water with subsurface rocks and the chemical solute exchange processes of aquifer minerals in the study area. Many researchers conclude that the elevated concentrations of F in groundwater are proportionately related to rock-water interaction99,100,101,102. Besides the rock weathering processes, climatic factors too play a critical role in regulating the evaporation in the semiarid region26. The scattering of samples in the Gibbs diagram signifies the impact of anthropogenic inputs in the aquifer systems. The role of the evaporation factor enhances the groundwater salinity by elevating the Na+ and Cl ions, resulting in the higher TDS concentrations, which are further abetted by anthropogenic activities103.
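The two ratios plotted against TDS in a Gibbs diagram can be computed as in the minimal sketch below. It assumes that all four ion concentrations are given in the same unit (the unit convention of the study is not restated here) and uses hypothetical sample values; assigning a sample to a zone still requires the graphical field boundaries of the diagram, which are not reproduced.

```python
def gibbs_ratios(na, ca, cl, hco3):
    """Return (cation ratio, anion ratio) for a Gibbs diagram.

    All concentrations must share one unit (ASSUMPTION: e.g. all in mg/L).
    """
    cation_ratio = na / (na + ca)
    anion_ratio = cl / (cl + hco3)
    return cation_ratio, anion_ratio


# Hypothetical samples: (Na+, Ca2+, Cl, HCO3, TDS), not study data
samples = [
    (50.0, 50.0, 30.0, 90.0, 450.0),
    (120.0, 40.0, 200.0, 100.0, 1600.0),
]
for na, ca, cl, hco3, tds in samples:
    cation_ratio, anion_ratio = gibbs_ratios(na, ca, cl, hco3)
    # Each sample gives two points: (cation_ratio, TDS) and (anion_ratio, TDS),
    # with TDS drawn on a logarithmic axis.
    print(f"TDS={tds:.0f}  cation ratio={cation_ratio:.2f}  anion ratio={anion_ratio:.2f}")
```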

The Piper diagram is a widely used graphical interpolation to characterize the hydrochemical interaction, water genesis and groundwater contamination sources29,104,105. Figure 5 depicts that the groundwater is dominated by alkaline earths over the alkalies and weak acids over the strong acids. This is represented by three hydrochemical facies, namely Ca2+–Mg2+–HCO3 (55.56%), Ca2+–Mg2+–Cl–SO42− (29.63%) and Na+–K+–HCO3 (14.81%). The groundwater samples are further classified into four water types, i.e., Ca2+–HCO3 (55.56%), Ca2+–Cl (7.40%), Ca2+–Mg2+–Cl (22.22%) and Ca2+–Na+–HCO3 (14.81%). The highest percentage of the Ca2+–HCO3 water type indicates dissolution of carbonate minerals by percolating water from irrigation runoff and precipitation in the subsurface aquifers11,76. The cations triangle shows that the majority of the samples (70.37%) belong to the no-dominant zone, and the remaining samples of 11.11%, 14.82% and 3.70% represent water types in Ca2+, Na+ and Mg2+ dominated zones, respectively. In the anions triangle, around 70.37% of the samples fall in the HCO3 water type, which indicates weathering of carbonate and silicate minerals and ion exchange processes in the groundwater106. Approximately 22.22% of the samples belong to the Cl water type, reflecting the role of anthropogenic factors and dissolution of evaporites in the groundwater26. The transformation of water types from Ca2+–HCO3 to Ca2+–Cl and Ca2+–Mg2+–Cl types reveals the adverse impacts of human activities and applications of N-chemicals on cultivated lands, which elevate the NO3 concentrations in groundwater107,108. Further, the conversion of water from Ca2+–HCO3 to Ca2+–Mg2+–Cl and Ca2+–Na+–HCO3 types is due to the dissolution of fluorite minerals (CaF2) and cation exchange between Ca2+ and Na+109,110. Subba Rao suggests that the weathering of rocks, higher Na+ and HCO3 (or NaHCO3) and the alkaline nature of water favour the gradual increase of F concentrations in groundwater100,111.

Figure 5

Piper diagram illustrating hydrochemical facies and water types: This graphical interpolation enables characterization of the hydrochemical interaction, genesis of water and groundwater contamination sources. Black arrows signify the conversion of water types due to anthropogenic and geogenic factors.

Source apportionment and geochemical relationships of NO3 and F with other parameters

Many workers have studied the relationship of nitrate and fluoride with specific parameters through scatter plots. For example, for nitrate: NO3 versus pH76, NO3 versus Cl−17, NO3 versus K+, NO3 versus Ca2+, NO3 versus SO42−, NO3 versus Cl112, NO3 versus EC, NO3 versus Cl, NO3 versus K+, NO3 versus SO42−, NO3 versus Na+, NO3 versus Ca2+, NO3 versus Mg2+, NO3 versus HCO327, and for fluoride: F versus pH, F versus HCO376, F versus HCO3, F versus Na+, F versus NO3102, F versus pH, F versus Ca2+113, F versus pH, F versus Na+, F versus K+, F versus HCO3, F versus Ca2+114. However, these studies have not evaluated NO3 and F holistically for their geochemical relationships with physical parameters and major cations and anions and also their source apportionment with site-specific datasets available. The present study is unique in the sense that it uses scatter plots to correlate NO3 and F with other physicochemical parameters independently (pH, EC, TH, Ca2+, Mg2+, Na+, K+, Cl, HCO3, SO42−, and F versus NO3) to achieve these objectives.

Source apportionment and geochemical relationship of NO3 with other parameters

A strong inverse correlation between NO3 and pH (r2 = 0.688 and y = − 0.0061x + 8.0993) indicates decreasing pH values with increasing NO3 concentrations (Fig. 6a). Dadgar and Payandeh115 also report this relationship in Tabriz province, Iran. The reaction of dissolved CO2 with water in the aquifer forms carbonic acid, which readily dissociates into H+ and HCO3 ions; this is an intensive process24. Further, NO3 ions rapidly react with free H+ ions to form HNO3, resulting in acidic conditions at higher NO3 concentrations (Eq. 7).

$$\left. \begin{gathered} {\text{CO}}_{2} + {\text{H}}_{2} {\text{O}} \to {\text{H}}_{2} {\text{CO}}_{3} \left( {{\text{Carbonic}}\;{\text{acid}}} \right) \hfill \\ {\text{H}}_{2} {\text{CO}}_{3} \to {\text{H}}^{ + } + {\text{HCO}}_{3}^{ - } \hfill \\ {\text{NO}}_{3}^{ - } + {\text{H}}^{ + } \leftrightarrow {\text{HNO}}_{3} \left( {{\text{Nitric}}\;{\text{acid}}} \right) \hfill \\ \end{gathered} \right\}$$
(7)
Figure 6

Source apportionment of NO3 is carried out with the help of such interactions.

Scatter plot correlations between NO3 and (a) pH, (b) EC, (c) TH, (d) Ca2+, (e) Mg2+, (f) Na+, (g) K+, (h) HCO3, (i) Cl, (j) SO42− in groundwater samples: Each plot signifies the relationship of NO3 with a particular physicochemical parameter to understand their geochemical interaction.
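The regression lines and r2 values quoted for these scatter plots follow from an ordinary least-squares fit of one parameter against another. A minimal sketch is given below; the function name `linear_fit` and the data points are hypothetical and serve only to show the calculation.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept and its r-squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical NO3 (mg/L) and pH pairs, not study data
no3 = [10.0, 40.0, 80.0, 120.0, 160.0]
ph = [8.1, 7.8, 7.7, 7.3, 7.1]
slope, intercept, r2 = linear_fit(no3, ph)
print(f"y = {slope:.4f}x + {intercept:.4f}, r2 = {r2:.3f}")
```

A negative slope corresponds to an inverse relationship, and r2 measures how much of the variation in the second parameter the fitted line accounts for.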

The scatter plot of NO3 versus EC shows a positive correlation (r2 = 0.5185), indicating higher mineralization of dissolved substances, including excess NO3 concentrations in groundwater (Fig. 6b). The samples with NO3 contents above the guideline value of 45 mg/L46 have higher EC in groundwater. Such a relationship is often associated with anthropogenic inputs, such as agricultural runoff, domestic sewage, poultry farming and unplanned urbanization, which release an enormous quantity of organic nitrogen and ammonia76,116,117,118. Ammonia is effectively adsorbed onto the soil particles, which restricts its movement. Under limited aerobic conditions in the soil, bacterial nitrification converts the immobilized ammonia into nitrate, as shown in Eq. (8). Anthropogenic inputs accelerate the nitrification process, which enhances leaching of NO3 from the soil in the percolating water recharging the aquifers.

$$\left. \begin{gathered} \mathop {2{\text{NH}}_{3} + 3{\text{O}}_{2} }\limits_{{({\text{Ammonia}})}} \to \mathop {2{\text{NO}}_{2}^{ - } }\limits_{{({\text{Nitrite}})}} + 2{\text{H}}^{ + } + 2{\text{H}}_{2} {\text{O}} \hfill \\ 2{\text{NO}}_{2}^{ - } + {\text{O}}_{2} \to 2{\text{NO}}_{3}^{ - } \left( {{\text{Nitrate}}} \right) \hfill \\ \end{gathered} \right\}$$
(8)

The possible mineral source contributing Ca2+ and Mg2+ in the groundwater is determined by Ca2+/Mg2+ ratio119. Figure 6d,e depicts the positive relationship of NO3 with Ca2+ (r2 = 0.6998) and Mg2+ (r2 = 0.5672), which indicates cation exchange processes in the groundwater due to prolonged application of N-fertilizers for crop production120. This cation exchange process significantly enhances the mineralization of Ca2+ and Mg2+ and elevates NO3 concentrations. Also, the nitrification process increases the NO3 level and acidity in groundwater resulting in Ca2+ and Mg2+ enrichment by the dissolution of carbonate minerals76,121, as illustrated in Fig. S3a, i.e., 14.8% samples by dolomite and 48.2% by calcite in our study area. The remaining 37% samples have Ca2+/Mg2+ ratio values > 2 depicting the influence of silicate weathering in groundwater122. The study area is a metamorphic terrain with a rich deposition of calcsilicate, hornblende, quartz and biotite57. Therefore, Ca2+ and Mg2+ concentrations in groundwater are influenced by carbonate and silicate rock-water interaction as expressed in Eqs. (9)–(12)24,123,124,125.

$${\text{CaCO}}_{3} + {\text{H}}_{2} {\text{CO}}_{3} \leftrightarrow {\text{Ca}}^{2 + } + 2{\text{HCO}}_{3}^{ - } \left( {{\text{calcite}}\;{\text{dissolution}}} \right)$$
(9)
$${\text{CaMg}}({\text{CO}}_{3} )_{2} + 2{\text{H}}^{ + } \leftrightarrow {\text{CaCO}}_{3} + {\text{Mg}}^{2 + } + {\text{H}}_{2} {\text{CO}}_{3} \left( {{\text{dolomite}}\;{\text{dissolution}}} \right)$$
(10)
$${\text{CaSO}}_{4} + {\text{CaMg}}({\text{CO}}_{3} )_{2} + 2{\text{H}}^{ + } \leftrightarrow {\text{CaCO}}_{3} + {\text{Ca}}^{2 + } + {\text{Mg}}^{2 + } + {\text{SO}}_{4}^{2 - } + {\text{H}}_{2} {\text{CO}}_{3} \left( {{\text{Anhydrite}}\;{\text{and}}\;{\text{dolomite}}\;{\text{dissolution}}} \right)$$
(11)
$$\left( {{\text{Na}},{\text{K}},{\text{Ca}},{\text{Mg}}} \right)\;{\text{silicate}} + {\text{H}}_{2} {\text{CO}}_{3} \to {\text{Na}}^{ + } + {\text{K}}^{ + } + {\text{Ca}}^{2 + } + {\text{Mg}}^{2 + } + {\text{H}}_{4} {\text{SiO}}_{4} + {\text{HCO}}_{3}^{ - } + {\text{Clay}}$$
(12)
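The Ca2+/Mg2+ ratio screening described above can be sketched as follows. The source states only that ratios > 2 point to silicate weathering; the cut-offs used here for dolomite (ratio ≤ 1) and calcite (1 < ratio ≤ 2) are a commonly used convention and are an assumption, as are the use of meq/L, the function names and the sample values.

```python
CA_EQ_WEIGHT = 20.04   # mg of Ca2+ per meq
MG_EQ_WEIGHT = 12.15   # mg of Mg2+ per meq


def ca_mg_ratio(ca_mg_l, mg_mg_l):
    """Ca2+/Mg2+ ratio in equivalents, from concentrations in mg/L."""
    return (ca_mg_l / CA_EQ_WEIGHT) / (mg_mg_l / MG_EQ_WEIGHT)


def classify_source(ratio):
    """Assign a probable mineral source from the ratio (ASSUMED thresholds)."""
    if ratio <= 1.0:
        return "dolomite dissolution"
    if ratio <= 2.0:
        return "calcite dissolution"
    return "silicate weathering"


def source_shares(samples):
    """Percentage of samples per source class; samples are (Ca, Mg) pairs in mg/L."""
    counts = {}
    for ca, mg in samples:
        label = classify_source(ca_mg_ratio(ca, mg))
        counts[label] = counts.get(label, 0) + 1
    return {label: 100.0 * count / len(samples) for label, count in counts.items()}


# Hypothetical (Ca2+, Mg2+) concentrations in mg/L, not study data
example = [(40.0, 30.0), (60.0, 20.0), (100.0, 15.0)]
for label, share in source_shares(example).items():
    print(f"{label}: {share:.1f}%")
```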

Since NO3 has a strong positive loading with Ca2+ and Mg2+, it exhibits a significant positive correlation with TH (r2 = 0.7247) (Fig. 6c). Water hardness is attributed to the elevated concentrations of dissolved alkaline earth elements (Ca2+ and Mg2+) in the aquifer system79. The scatter plots of NO3 with Na+ (r2 = 0.085) (Fig. 6f) and K+ (r2 = 0.0627) (Fig. 6g) signify a very weak positive relationship and suggest that anthropogenic inputs are not the only source of alkali ions in groundwater. The bivariate plot of Na+ + K+ versus TZ+ (Fig. S3b) shows that all groundwater samples fall below the 1:1 equiline. This indicates the weathering effect of silicate minerals besides the anthropogenic impacts, such as the application of NPK fertilizers and discharge of untreated sewage on the open ground, which elevate the Na+ and K+ concentrations in groundwater11,27,126,127. In the study area, albite, microcline and alunite dissolution are the key sources of Na+ and K+ ions through rock-water interactions, as shown in Eqs. (13)–(15).

$$\mathop {2{\text{NaAlSi}}_{3} {\text{O}}_{8} }\limits_{{({\text{Albite}})}} + 2{\text{CO}}_{2} + 11{\text{H}}_{2} {\text{O}} \to \mathop {{\text{Al}}_{2} {\text{Si}}_{2} {\text{O}}_{5} \left( {{\text{OH}}} \right)_{4} }\limits_{{({\text{Kaolinite}})}} + 4{\text{H}}_{4} {\text{SiO}}_{4} + 2{\text{Na}}^{ + } + 2{\text{HCO}}_{3}^{ - }$$
(13)
$$\mathop {2{\text{KAlSi}}_{3} {\text{O}}_{8} }\limits_{{({\text{Microcline}})}} + 2{\text{CO}}_{2} + 11{\text{H}}_{2} {\text{O}} \to \mathop {{\text{Al}}_{2} {\text{Si}}_{2} {\text{O}}_{5} \left( {{\text{OH}}} \right)_{4} }\limits_{{({\text{Kaolinite}})}} + 4{\text{H}}_{4} {\text{SiO}}_{4} + 2{\text{K}}^{ + } + 2{\text{HCO}}_{3}^{ - }$$
(14)
$$\mathop {{\text{KAl}}_{3} ({\text{SO}}_{4} )_{2} \left( {{\text{OH}}} \right)_{6} }\limits_{{({\text{Alunite}})}} + 3{\text{CO}}_{2} + {\text{H}}_{2} {\text{O}} \to \mathop {3{\text{Al}}\left( {{\text{OH}}} \right)_{3} }\limits_{{({\text{Gibbsite}})}} + 3{\text{HCO}}_{3}^{ - } + {\text{K}}^{ + } + 2{\text{SO}}_{4}^{2 - }$$
(15)

The scatter plot of NO3 versus HCO3 shows the least positive loading (r2 = 0.012) among the anions (Fig. 6h). This relationship suggests that the HCO3 does not exhibit much variation with increasing or decreasing NO3 concentrations. The fact that HCO3 ions are the dominant anions in the groundwater samples confirms that its primary source is possibly carbonate and silicate weathering26,76,122,128, as shown in Eqs. (9)–(12).

In NO3 versus Cl plot (Fig. 6i), their positive correlation (r2 = 0.5943) implies a common source, such as a combination of oxidation of animal and human waste44, application of manure and nitrogenous fertilizers129, septic tank seepages130, agricultural runoff131, etc. Similar findings are reported in the semiarid regions of many Indian States, such as Punjab24, Rajasthan132, Andhra Pradesh133, and Telangana134.

Figure 6j depicts the weak positive loading between NO3 and SO42− (r2 = 0.1622) due to two separate sets of NO3 and SO42− concentrations in the groundwater samples. The samples having low or high NO3 levels have both low and high SO42− concentrations, thus ruling out the influence of anthropogenic activities on SO42−.

The plot of Ca2+ versus SO42− (Fig. S3c) is meant to identify the minerals that contribute to higher amount of Ca2+ and SO42− ions in groundwater76. Majority of the samples (92.6%) are below the equiline (1:1), indicating that the role of gypsum (CaSO4·2H2O) dissolution is insignificant. The remaining samples (7.4%) falling along the equiline depict the dissolution of anhydrite (CaSO4) mineral in the groundwater135,136. The gypsum precipitation in the groundwater occurs through direct hydration of anhydrite and dissolution of calcium-bearing minerals oxidized with sulphate and hydronium ions137, as expressed in Eqs. (16) and (17). Hence, the weak positive correlation between Ca2+ and SO42− (r2 = 0.197) (Fig. S3c) suggests that the limited concentrations of Ca2+ ions in the groundwater may be due to the precipitation of gypsum138. If the study area lacks gypsum mineral, then the biologically oxidized sulphur containing compounds deposited by the rainwater and nitrogen compounds in the soil leach down to groundwater as SO42− and NO3 ions139. Thus, the positive regression line between NO3 and SO42− (y = 0.3084x + 16.957) (Fig. 6j) is found in the groundwater samples of the study area. Karunanidhi27 report similar findings on the positive relationship between NO3 and SO42− in the groundwater samples of Tiruppur region, India. Moreover, the dissolution of alunite [KAl3(SO4)2(OH)6], as expressed in Eq. (15), will also contribute to the SO42− ions in groundwater.

$${\text{CaSO}}_{4} + 2{\text{H}}_{2} {\text{O}} \to {\text{CaSO}}_{4} \cdot 2{\text{H}}_{2} {\text{O}}$$
(16)
$${\text{CaCO}}_{3} + 2{\text{H}}^{ + } + {\text{SO}}_{4}^{2 - } + {\text{H}}_{2} {\text{O}} \to {\text{CaSO}}_{4} \cdot 2{\text{H}}_{2} {\text{O}} + {\text{CO}}_{2}$$
(17)

Source apportionment and geochemical relationship of F with other parameters

Normally, high pH in groundwater reflects its alkaline nature, resulting in elevated concentrations of HCO3 and hydroxyl (OH) ions (Eq. 18, Tables 2 and S2, Fig. S2). A fairly positive relationship between pH and F (r2 = 0.2607; Fig. 7a) indicates that alkaline water favours dissolution and mobilization of F bearing minerals in groundwater140. The weathering of fluoride-bearing rocks, in which OH ions replace F ions in the lattices of minerals such as muscovite, biotite, amphibole and hornblende, has enriched the F concentrations in groundwater. Xiao141 and Karunanidhi142 express the displacement mechanism of F ions by OH ions in the muscovite, biotite, and hornblende minerals as follows (Eqs. 19–21).

$${\text{HCO}}_{3}^{ - } + {\text{H}}_{2} {\text{O}} = {\text{H}}_{2} {\text{CO}}_{3} + {\text{OH}}^{ - }$$
(18)
$${\text{KAl}}_{2} \left[ {{\text{AlSi}}_{3} {\text{O}}_{10} } \right]{\text{F}}_{2} + 2{\text{OH}}^{ - } = {\text{KAl}}_{2} \left[ {{\text{AlSi}}_{3} {\text{O}}_{10} } \right]\left( {{\text{OH}}} \right)_{2} + 2{\text{F}}^{ - }$$
(19)
$${\text{KMg}}_{3} \left[ {{\text{AlSi}}_{3} {\text{O}}_{10} } \right]{\text{F}}_{2} + 2{\text{OH}}^{ - } = {\text{KMg}}_{3} \left[ {{\text{AlSi}}_{3} {\text{O}}_{10} } \right]\left( {{\text{OH}}} \right)_{2} + 2{\text{F}}^{ - }$$
(20)
$${\text{NaCa}}_{2} ({\text{Mg}},{\text{Fe}},{\text{Al}})_{5} ({\text{Al}},{\text{Si}})_{8} {\text{O}}_{22} {\text{F}}_{2} + 2{\text{OH}}^{ - } \to {\text{NaCa}}_{2} ({\text{Mg}},{\text{Fe}},{\text{Al}})_{5} ({\text{Al}},{\text{Si}})_{8} {\text{O}}_{22} \left( {{\text{OH}}} \right)_{2} + 2{\text{F}}^{ - }$$
(21)
Figure 7

Source apportionment of F is carried out with the help of such interactions.

Scatter plot correlations between F and (a) pH, (b) EC, (c) TH, (d) Ca2+, (e) Mg2+, (f) Na+, (g) K+, (h) HCO3, (i) Cl, (j) NO3, (k) SO42− in groundwater samples: Each plot signifies the relationship of F with a particular physicochemical parameter to understand their geochemical interaction.

Figure 7b shows a negative trend between the EC and F (r2 = 0.0118; y = -147.3x + 1079), indicating no major influence of EC on F ion concentrations. A rather weak, but negative relationship of F with Ca2+ (r2 = 0.1262, y = -31.957x + 97.575) and Mg2+ (r2 = 0.0922, y = -6.8172x + 27.41) indicates decreasing concentrations of Ca2+ and Mg2+ ions in groundwater with increasing F content (Fig. 7d,e). Various workers report similar findings between F versus Ca2+ elsewhere53,110. The excess concentrations of HCO3 with high pH contribute to the alkaline water, thus favouring the dissolution of fluorite (CaF2) in groundwater due to precipitation of CaCO3143, as shown in Eq. (22).

$${\text{CaF}}_{2} + 2{\text{HCO}}_{3}^{ - } = {\text{CaCO}}_{3} + 2{\text{F}}^{ - } + {\text{H}}_{2} {\text{O}} + {\text{CO}}_{2}$$
(22)

Jack suggests that the rock-water interaction of fluoride-bearing minerals from a recharge area through the fracture zone would precipitate the Ca2+ and Mg2+ ions as calcite, Mg-Calcite dolomite, and dolomite fluorite, respectively, along the groundwater flow path to a discharge area144. Hem states that because Ca2+ and Mg2+ ions are divalent cations with similar properties, they possess the same stability with other ion pairs (SO42−, CO32− and HCO3) and contribute similarly to water hardness145. Thus, the inverse relationship between TH and F (Fig. 7c) is due to decreased Ca2+ and Mg2+ ion concentrations or precipitation of calcium carbonate and Mg-calcite dolomite, which enhances the solubility of fluoride-bearing minerals in the study area139,146.

While examining the role of Na+, it is found that the Na+/Ca2+ ratio helps in understanding the probable reason for the lowering of Ca2+ activity in groundwater144. Around 37% of water samples have Na+/Ca2+ ratio > 1, indicating that evapotranspiration is possibly affecting the Ca2+ activity by precipitating it and increasing the Na+ concentrations, thus favouring the enrichment of F content in groundwater (Fig. S3d). The study area is a semiarid region characterized by drier climatic conditions where the dissolved constituents are readily concentrated and precipitated by evaporation, thereby leading to groundwater salinity147,148. The remaining 63% of groundwater samples show Na+/Ca2+ ratio < 1, which indicates that rock-water interaction is another key factor generating Ca2+ and F ions through the dissolution of fluorite minerals in the groundwater. However, Ca2+ ions subsequently react with NaHCO3 to precipitate CaCO3 (Eq. 23). In a similar line, Arveti99 report that high F content in groundwater is directly related to the dissolution of fluoride enriched minerals during the prolonged residence time of water, since physiographic conditions or low hydraulic conductivity in aquifers provide a longer contact period. The plot of Na+ versus F (r2 = 0.1475) with a positive slope (y = 28.723x + 38.045) indicates a gradual increase of F concentrations with elevated Na+ content in groundwater (Fig. 7f). The higher concentrations of NaHCO3 or Na+ ions with alkaline pH in groundwater allow dissolution of F ions from fluorite (CaF2) through rock–water interaction143,149 (Eq. 23).

$${\text{CaF}}_{2} + 2{\text{NaHCO}}_{3} = {\text{CaCO}}_{3} + 2{\text{Na}}^{ + } + 2{\text{F}}^{ - } + {\text{H}}_{2} {\text{O}} + {\text{CO}}_{2}$$
(23)

In Fig. 7g, the plot of K+ versus F shows no significant positive or negative relationship between them. The flat linear regression (r2 = 0.0001) indicates that K+ does not have much influence on the fluoride mineralization in groundwater. The orthoclase feldspar (KAlSi3O8) is generally resistant to attack by water, but apparently gets altered to silica, clay, and K+ ions145. In the study area, rapid precipitation of alunite occurs in the aquifers due to the high degree of stability of potassium-bearing alumino-silicate minerals, resulting in low content of K+ in groundwater.

The positive trend between HCO3 and F (r2 = 0.1108; y = 61.992x + 239.04) shows that the increase in HCO3 content supports the dissolution of F bearing minerals in groundwater (Fig. 7h). However, a few samples have low HCO3 concentrations with high F values, which indicates that the F enrichment in groundwater is affected by a combination of processes, such as evapotranspiration and calcite precipitation150. The HCO3/Ca2+ ratio predicts the likelihood of F enrichment in groundwater140. About 85% of samples show HCO3/Ca2+ ratio > 1 (Fig. S3e), signifying that groundwater hydrological conditions are still favourable for further enrichment of fluoride minerals in the study area, with their saturation index ranging from -2.66 to -0.68 (undersaturated condition).
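A saturation index of this kind is defined as SI = log10(IAP/Ksp), where a negative value means undersaturation with respect to the mineral. The sketch below applies this definition to fluorite (CaF2 = Ca2+ + 2F) under two loudly stated simplifications: concentrations are used in place of activities (no activity-coefficient or speciation correction), and log Ksp = -10.6 at 25 °C is assumed. It is not the geochemical model used in the study, and the sample values are hypothetical.

```python
import math

CA_MOLAR_MASS = 40.078     # g/mol
F_MOLAR_MASS = 18.998      # g/mol
LOG_KSP_FLUORITE = -10.6   # ASSUMED solubility product at 25 degrees C


def fluorite_si(ca_mg_l, f_mg_l, log_ksp=LOG_KSP_FLUORITE):
    """Approximate fluorite saturation index from Ca2+ and F in mg/L.

    Simplification: molar concentrations stand in for ion activities.
    """
    ca_mol = ca_mg_l / 1000.0 / CA_MOLAR_MASS
    f_mol = f_mg_l / 1000.0 / F_MOLAR_MASS
    log_iap = math.log10(ca_mol) + 2.0 * math.log10(f_mol)
    return log_iap - log_ksp


# Hypothetical sample: 40.078 mg/L Ca2+ and 1.8998 mg/L F (not study data)
si = fluorite_si(40.078, 1.8998)
state = "undersaturated" if si < 0 else "saturated or oversaturated"
print(f"SI = {si:.2f} ({state})")
```

Because F enters the ion activity product squared, doubling the fluoride concentration raises SI by about 0.6 log units, whereas doubling Ca2+ raises it by about 0.3.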

The application of phosphatic and chloride containing fertilizers is the main anthropogenic source of high F, NO3 and Cl contamination in the groundwater24,150. Figure 7i,j shows an inverse relationship of F with Cl (r2 = 0.038, y = -36.228x + 107.74) and NO3 (r2 = 0.2135, y = -31.485x + 68.551), respectively. These plots signify that F contamination in groundwater is from a different source than that of Cl and NO3; thus, the role of agricultural inputs in F generation is ruled out. In some cases, when the redox potential falls below a certain value in groundwater, the denitrification of NO3 by nitrate-reducing bacteria, accompanied by increased pH, enhances the precipitation of Ca2+, resulting in high F and HCO3 concentrations in water (Eq. 24)139.

The inverse correlation between SO42− and F (r2 = 0.0456; y = − 11.144x + 39.442) indicates two different sets of SO42− and F contents in the groundwater samples (Fig. 7k). The samples having low or high F levels have both low and high SO42− concentrations, thus ruling out the influence of anthropogenic activities. In groundwater, when the redox potential is below a specific value due to high evapotranspiration, sulphate-reducing bacteria initiate a desulphurisation process that results in the loss of SO42− ions (Eq. 25). Further, the desulphurisation process raises the pH value, thus favouring fluorite solubility and leading to high concentrations of F and HCO3 ions and precipitation of Ca2+ ions as CaCO3 in groundwater139. Many researchers have observed a similar relationship between SO42− and F elsewhere144,150,151, because the decrease in solubility of fluorite minerals is affected by the presence of SO42− ions in groundwater.

$$2{\text{NO}}_{3} \to \mathop {\mathop {2{\text{HO}}}\limits_{ \downarrow } }\limits_{{{\text{NH}}_{3} }} - {\text{N}} = {\text{O}} \to {\text{HO}} - {\text{N}} = {\text{N}} - \mathop {\mathop {{\text{OH}}}\limits_{ \searrow } \to \mathop {N_{2} }\limits_{ \nearrow } }\limits_{{{\text{N}}_{2} {\text{O}}}}$$
(24)
$${\text{SO}}_{4}^{2 - } + {\text{CH}}_{4} \to {\text{HS}}^{ - } + {\text{HCO}}_{3}^{ - } + {\text{H}}_{2} {\text{O}}$$
(25)

Chemometric analysis

Principal component analysis

Principal component analysis (PCA) is applied to the 13 chemical parameters to extract the significant principal components (PCs) that define the hydrogeochemistry of the study area and help identify the probable sources of these parameters in groundwater. A scree plot is generated to determine the eigenvalues of the PCs using the varimax rotation method. Of all the extracted PCs, three with eigenvalues greater than 1 are considered significant. PC1, PC2 and PC3 explain 57.60%, 18.60% and 9.90% of the variance, respectively, and their cumulative variance is 86.10% of all analyzed parameters (Table 5). The significant PCs having parameter loading scores of > 0.75 (strong, marked bold) and between 0.50 and 0.75 (moderate, marked bold with italics) are considered for the PCA interpretation. The first principal component (PC1), which explains 57.60% of the total variance, shows strong positive loading on EC, TDS, TH, Ca2+, Mg2+, Cl and NO3 and a strong inverse relationship with pH (Table 5).
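The extraction step described above (eigenvalues of the correlation matrix, retention of components with eigenvalue > 1, percentage of variance explained) can be sketched with NumPy as follows. The sketch stops at unrotated loadings: the varimax rotation used in the study is not implemented here, and the random data, the matrix size and the function name `pca_summary` are hypothetical.

```python
import numpy as np


def pca_summary(data):
    """Eigenvalues, explained variance (%) and unrotated loadings of retained PCs.

    data: array of shape (n_samples, n_parameters). PCA is run on the
    correlation matrix, so each parameter is implicitly standardised.
    """
    data = np.asarray(data, dtype=float)
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)           # ascending order
    order = np.argsort(eigvals)[::-1]                 # re-sort, largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = 100.0 * eigvals / eigvals.sum()
    keep = eigvals > 1.0                              # eigenvalue > 1 criterion
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return eigvals, explained, loadings


# Hypothetical data: 27 samples, 5 parameters, three of them sharing a common factor
rng = np.random.default_rng(0)
common = rng.normal(size=(27, 1))
data = np.hstack([
    common + 0.3 * rng.normal(size=(27, 3)),          # three correlated parameters
    rng.normal(size=(27, 2)),                         # two independent parameters
])
eigvals, explained, loadings = pca_summary(data)
print("eigenvalues:", np.round(eigvals, 2))
print("explained variance (%):", np.round(explained, 1))
print("retained components:", loadings.shape[1])
```

Since the eigenvalues of a correlation matrix sum to the number of parameters, the percentage of variance explained by a component is its eigenvalue divided by that number, which is how eigenvalues and variance percentages relate.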

Table 5 Rotated varimax component matrix of the analysed groundwater samples around village Supebeda in Chhattisgarh State, India.

The loading of TH (0.98) is directly related to the Ca2+ (0.96) and Mg2+ (0.86) scores, indicating that water hardness is influenced by the alkaline earth concentrations in aquifers77,104. The weathering and dissolution of carbonate (calcite and dolomite) and silicate minerals through rock-water interaction are the probable sources of Ca2+ and Mg2+ in groundwater, which is also supported by the Ca2+/Mg2+ ratio24. The weak loading of alkalis (Na+: 0.45 and K+: 0.18) with respect to alkaline earths (Ca2+ and Mg2+) supports the cation exchange process in groundwater79.

The high loadings of Cl (0.93) and NO3 (0.90) indicate the effect of agrochemicals and domestic sewage on groundwater29,152. The application of chemical fertilizers containing inorganic chlorine and nitrogen, namely anhydrous ammonium chloride, ammonium nitrate and urea, is a matter of concern11. The inverse loading of pH (-0.92) is due to the dissolution of CO2 and the oxidation of organic matter, which form carbonic acid and thereby release free H+ ions153. The inorganic chlorine and nitrogen react rapidly with H+ ions to form HCl and HNO3, which decrease the pH of groundwater. The high scores of EC (0.89) and TDS (0.89) are due to the elevated concentrations of Ca2+, Mg2+, Cl and NO3 ions, which enhance the mineralization of groundwater in the study area. Therefore, PC1 is controlled by lithogenic (Ca2+ and Mg2+) and anthropogenic (Cl and NO3) factors.

The second principal component (PC2) explains 18.60% of the total variance. It loads strongly on Na+ (0.85) and F (0.77), moderately on HCO3 (0.74) and insignificantly on Ca2+ (0.10), indicating lithogenic sources of these elements (Table 5). PC2 indicates that the dissolution of fluoride-bearing minerals is influenced by the elevated concentrations of Na+ and HCO3 or NaHCO3 in the aquifer system. On the other hand, the weak correlation of Ca2+ with F (Fig. 7d) suggests that high Ca2+ content in groundwater inhibits fluoride mineralization at alkaline pH25,154,155. Therefore, PC2 represents fluoride dissolution through rock-water interaction.

Lastly, the third principal component (PC3), which explains 9.90% of the total variance, shows a strong positive loading on K+ (0.79), a moderate positive loading on HCO3 (0.50) and a moderate negative loading on SO42− (− 0.59) (Table 5). The main sources of K+ and HCO3 are the weathering of silicate, muscovite, biotite, and microcline minerals found in the study area. The negative score of SO42− is due to the leaching of inorganic sulphides present in the sediments through percolating water, weathering of pyrite-sulphides bearing minerals, namely pyroxene, amphiboles, magnetite and olivine156, and biological oxidation of sulphur-containing compounds in soil139. The oxidation of these minerals in the soil profile or subsurface layers proceeds through oxygen transport, viz., convection and direct exposure to air, because groundwater levels are lowered by evapotranspiration and groundwater extraction157. Further, the inverse correlations of SO42− with K+ and HCO3 reflect the different mineral sources contributing these ions to the aquifer system. The concentrations of K+, HCO3 and SO42− are well within the acceptable limits or guideline values of BIS46 and WHO45, thus indicating geogenic inputs.
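The loading thresholds used in the interpretation above can be written as a small rule. The following is a minimal sketch that classifies only the loadings quoted in the text (not the full Table 5). It assumes that the sign is ignored when grading the strength of a loading and that a value of exactly 0.75 counts as moderate; the function name `loading_class` and the dictionary layout are illustrative.

```python
def loading_class(value):
    """Grade a loading by its absolute value: > 0.75 strong, 0.50-0.75 moderate."""
    magnitude = abs(value)  # ASSUMPTION: negative loadings are graded by magnitude
    if magnitude > 0.75:
        return "strong"
    if magnitude >= 0.50:
        return "moderate"
    return "weak"


# Loadings quoted in the text for each component (subset of Table 5).
quoted_loadings = {
    "PC1": {"TH": 0.98, "Ca": 0.96, "Mg": 0.86, "Cl": 0.93, "NO3": 0.90,
            "pH": -0.92, "EC": 0.89, "TDS": 0.89, "Na": 0.45, "K": 0.18},
    "PC2": {"Na": 0.85, "F": 0.77, "HCO3": 0.74, "Ca": 0.10},
    "PC3": {"K": 0.79, "HCO3": 0.50, "SO4": -0.59},
}

classified = {
    pc: {name: loading_class(value) for name, value in loadings.items()}
    for pc, loadings in quoted_loadings.items()
}

for pc, grades in classified.items():
    print(pc, grades)
```

Grading by magnitude reproduces the wording of the text: pH is a strong (inverse) loading on PC1, HCO3 is moderate on both PC2 and PC3, and Na+ and K+ are weak on PC1.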

Cluster analysis

Cluster analysis (CA) is employed on the 27 groundwater samples to group similar samples into clusters, which are displayed as a dendrogram. The samples grouped in each cluster are marked by certain specific parameters controlling them. Therefore, the variation among the clusters can be identified by computing the average value of each parameter over the sample(s) within a cluster to assess the specific tracers for each cluster82,86. Figure 8 depicts three significant clusters [(Dlink/Dmax) * 100 < 105] from the dendrogram. Table 6 provides the average values of the groundwater parameters for each cluster. Cluster 1 (C1) contains the largest number of samples (G1, G4, G8, G9, G12, G20, G14, G21, G2, G3, G5, G6, G7, G10, G11, G13, G15) and has the highest values of pH and F, higher values of Na+ and HCO3 and the lowest value of Ca2+, which indicate fluoride enrichment. The average values of the parameters belonging to C1 are below their respective standard limits of BIS46 and WHO45, except for F (1.14) (Table 6). Thus, the groundwater quality of C1 is influenced by the dissolution of fluoride-bearing minerals and fits well with PC2.
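The averaging step described above, in which each parameter is averaged over the samples of a cluster to find its tracers, can be sketched as follows. All sample labels, cluster labels and concentrations in this example are hypothetical and are not taken from Table 6 or Table S2; `cluster_means` is an illustrative helper.

```python
from collections import defaultdict
from statistics import mean

# HYPOTHETICAL samples: cluster label plus two parameters (mg/L).
samples = {
    "S1": {"cluster": "A", "F": 1.2, "Ca": 20.0},
    "S2": {"cluster": "A", "F": 1.0, "Ca": 30.0},
    "S3": {"cluster": "B", "F": 0.4, "Ca": 90.0},
    "S4": {"cluster": "B", "F": 0.6, "Ca": 110.0},
}


def cluster_means(records):
    """Average every parameter over the samples that share a cluster label."""
    grouped = defaultdict(lambda: defaultdict(list))
    for record in records.values():
        for name, value in record.items():
            if name != "cluster":
                grouped[record["cluster"]][name].append(value)
    return {
        label: {name: mean(values) for name, values in params.items()}
        for label, params in grouped.items()
    }


means = cluster_means(samples)
for label, averages in means.items():
    print(label, averages)
```

In this toy case cluster A would be read as the fluoride-rich, calcium-poor group and cluster B as the opposite, which mirrors how the tracers of each cluster are assessed.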

Figure 8

Dendrogram of groundwater sampling locations around village Supebeda in Chhattisgarh State, India: Three different clusters (C1, C2, and C3) are identified by Ward's method, with the Euclidean distance used to determine the similarity/dissimilarity. The relatively homogeneous samples are grouped in each cluster and marked by certain specific parameters controlling them. On the y-axis, (Dlink/Dmax) * 100 represents the linkage distance for a particular case divided by the maximal linkage distance. The quotient is then multiplied by 100 to standardize the linkage distance represented by the y-axis.
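The construction behind such a dendrogram can be illustrated with a small pure-Python sketch of agglomerative clustering under Ward's criterion on Euclidean distances, followed by the rescaling of each linkage distance to a percentage of the maximal linkage distance. The six data points are hypothetical standardized values, the function names are illustrative, and the Ward distance is scaled so that two single samples are separated by their ordinary Euclidean distance (a common convention, assumed here rather than taken from the source).

```python
from itertools import combinations
from math import sqrt


def centroid(points):
    """Component-wise mean of a list of equal-length coordinate tuples."""
    return [sum(axis) / len(points) for axis in zip(*points)]


def ward_distance(a, b):
    """Ward linkage distance between two clusters given as lists of points."""
    ca, cb = centroid(a), centroid(b)
    squared = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return sqrt(2.0 * len(a) * len(b) / (len(a) + len(b)) * squared)


def ward_merge_heights(points):
    """Repeatedly merge the closest pair; return the distance of each merge."""
    clusters = [[p] for p in points]
    heights = []
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        heights.append(ward_distance(clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return heights


def rescale(heights):
    """Express each linkage distance as a percentage of the maximal one."""
    d_max = max(heights)
    return [d / d_max * 100.0 for d in heights]


def clusters_at(rescaled, cut):
    """Number of clusters left when the dendrogram is cut at `cut` percent."""
    return 1 + sum(1 for d in rescaled if d > cut)


# HYPOTHETICAL standardized values for six samples (two parameters each).
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0), (10.0, 0.2)]
rescaled = rescale(ward_merge_heights(points))
print([round(d, 1) for d in rescaled])
print(clusters_at(rescaled, cut=50.0))
```

Because Ward merge distances never decrease from one merge to the next, counting the merges above a cut level is enough to obtain the number of clusters at that level.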

Table 6 Average values of the physicochemical parameters for each cluster.

Cluster 2 (C2) shows higher values of TDS, TH, Ca2+, K+, SO42− and NO3 and the lowest value of F. The average values of TDS (627.47), TH (316.43), Ca2+ (88.29) and NO3 (65.11) are above the acceptable limits of BIS46 due to their excess concentrations in samples G19, G22, G23, G24, G26 and G27 (Table S2 and Table 6). The average F value of C2 (0.53) is the lowest among the three significant clusters because only one sample (G22) just touches the BIS46 acceptable limit of F (Table S2). The groundwater samples that represent C2 (G16, G23, G19, G22, G26, G24, G27) have K+ and SO42− concentrations below their respective guideline values of WHO45 and BIS46. Therefore, C2 is influenced by both geogenic and anthropogenic factors.

Finally, C3 is the smallest cluster (G17, G18, G25) and is marked by the highest values of EC, TDS, TH, Ca2+, Mg2+, Na+, K+, HCO3, Cl, SO42− and NO3 and the lowest value of pH (Table 6). The average values of EC (2939.67), TDS (1881.39), TH (686.67), Ca2+ (192.67), Mg2+ (49.20), Cl (355.03) and NO3 (108.27) are above their respective guideline or acceptable limits of BIS46 and WHO45, owing to their elevated contents in samples G17, G18 and G25, which also decrease the pH of groundwater; the averages of Na+ (140.57), K+ (2.60), HCO3 (421.00) and F (0.73) remain within these limits. On the other hand, only sample G18 has concentrations of HCO3 and F above their acceptable limits defined by BIS46 (Table S2). Therefore, the parameters that chiefly influence C3 are EC, TDS, TH, Ca2+, Mg2+, Na+, K+, Cl and NO3, which indicate geogenic and anthropogenic inputs enhancing the mineralization of groundwater. Overall, C2 and C3 correspond to the combination of PC1 and PC3.
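The comparison of cluster averages with drinking-water limits can be expressed as a simple check. The sketch below uses only the cluster averages quoted in the text. The limit values in `ACCEPTABLE_LIMIT` are assumptions based on commonly cited BIS acceptable limits (in mg/L) and should be verified against the standard cited in this work; the helper `exceedances` is illustrative.

```python
# ASSUMED acceptable limits in mg/L; verify against the cited BIS standard.
ACCEPTABLE_LIMIT = {
    "TDS": 500.0, "TH": 200.0, "Ca": 75.0, "Mg": 30.0,
    "Cl": 250.0, "NO3": 45.0, "F": 1.0,
}

# Cluster averages quoted in the text (subset of Table 6).
cluster_average = {
    "C1": {"F": 1.14},
    "C2": {"TDS": 627.47, "TH": 316.43, "Ca": 88.29, "NO3": 65.11, "F": 0.53},
    "C3": {"TDS": 1881.39, "TH": 686.67, "Ca": 192.67, "Mg": 49.20,
           "Cl": 355.03, "NO3": 108.27, "F": 0.73},
}


def exceedances(averages, limits):
    """Names of parameters whose cluster average is above the given limit."""
    return sorted(
        name for name, value in averages.items()
        if name in limits and value > limits[name]
    )


flagged = {label: exceedances(avg, ACCEPTABLE_LIMIT) for label, avg in cluster_average.items()}
for label, names in flagged.items():
    print(label, names)
```

With these assumed limits, the check flags F alone in C1 and the hardness, salinity and nitrate parameters in C2 and C3, in line with the description of the three clusters.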

Conclusions

This paper presents a non-carcinogenic human health risk assessment (HHRA) of NO3 and F contamination in groundwater for four age groups (infants, children, teens and adults) through ingestion and dermal contact using deterministic and probabilistic approaches, source apportionment of NO3 and F with multiple parameters, and chemometric modelling to extract the latent factors controlling the groundwater chemistry. Results of the deterministic and probabilistic hazard quotients (\(HQ\)) of nitrate (\(HQ_{{{\text{NO}}_{3}^{ - } }}\)) and fluoride (\(HQ_{{{\text{F}}^{ - } }}\)) signify that the ingestion pathway has potential non-carcinogenic health implications for all target populations. The deterministic results of the risk certainty levels (RCL) of the hazard index (\(HI\)) above unity for nitrate (\(HI_{{{\text{NO}}_{3}^{ - } }}\)) stand at infants (51.85%) > children (48.15%) > teens (40.74%) = adults (40.74%) and for fluoride (\(HI_{{{\text{F}}^{ - } }}\)) at infants (66.67%) > children (37.04%) > adults (33.33%) > teens (25.23%). However, the probabilistically calculated RCL health risks in the subpopulation groups are in the order of infants (\(HI_{{{\text{NO}}_{3}^{ - } }}\) = 34.03% and \(HI_{{{\text{F}}^{ - } }}\) = 24.17%) > children (\(HI_{{{\text{NO}}_{3}^{ - } }}\) = 23.01% and \(HI_{{{\text{F}}^{ - } }}\) = 10.56%) > teens (\(HI_{{{\text{NO}}_{3}^{ - } }}\) = 13.17% and \(HI_{{{\text{F}}^{ - } }}\) = 2.00%) > adults (\(HI_{{{\text{NO}}_{3}^{ - } }}\) = 11.62% and \(HI_{{{\text{F}}^{ - } }}\) = 1.25%). These figures reveal that the deterministic outputs assign higher degrees of potential human health risk to all the subpopulation groups than the probabilistic model does. Field observations do not support the deterministic conclusions but are consistent with the probabilistic RCL values. This may be because the deterministic estimation assumes an extreme (single point) value for each input variable at different concentration levels of the indicator parameters, thus possibly overestimating the output results, since the extreme value may not represent the actual field conditions. Also, since the deterministic approach has no provision for validating its output results, the analysis derived from it is speculative by nature. By contrast, the probabilistic approach allows each input to be drawn from a range of values reflecting the most likely field conditions and provides for sensitivity analysis, which enables validation of the input variables affecting the output results among the various exposure pathways. Due to all these considerations, this study concludes that probabilistic modelling is superior to deterministic approaches in human health risk assessment.
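The contrast between a single-point (deterministic) estimate and a Monte Carlo (probabilistic) estimate can be illustrated with a minimal sketch for the ingestion pathway. Everything numerical here is an assumption made for illustration: the reference dose `RFD_NITRATE`, the point values, the input distributions, and the simplification that exposure is daily so that frequency and averaging-time terms cancel. The reading of RCL as the share of simulated cases with a hazard quotient above unity is likewise an interpretation, and none of the outputs correspond to the results reported in this study.

```python
import random

RFD_NITRATE = 1.6  # mg/kg/day; ASSUMED oral reference dose, verify before use


def hazard_quotient(conc_mg_l, intake_l_day, body_weight_kg, rfd=RFD_NITRATE):
    """Ingestion HQ, assuming daily exposure so frequency/averaging terms cancel."""
    daily_dose = conc_mg_l * intake_l_day / body_weight_kg
    return daily_dose / rfd


def probabilistic_rcl(n_draws=10_000, seed=0):
    """Percent of Monte Carlo draws with HQ > 1 (all distributions HYPOTHETICAL)."""
    rng = random.Random(seed)
    above = 0
    for _ in range(n_draws):
        conc = rng.lognormvariate(3.4, 0.6)         # concentration, mg/L
        intake = rng.uniform(1.5, 3.0)              # water intake, L/day
        weight = max(rng.gauss(60.0, 10.0), 30.0)   # body weight, kg (floored)
        if hazard_quotient(conc, intake, weight) > 1.0:
            above += 1
    return 100.0 * above / n_draws


# Deterministic estimate from one HYPOTHETICAL set of point values.
point_hq = hazard_quotient(conc_mg_l=45.0, intake_l_day=2.0, body_weight_kg=60.0)
print(round(point_hq, 4))

# Probabilistic estimate: share of simulated cases above unity.
print(probabilistic_rcl())
```

The deterministic call returns one number that depends entirely on the chosen point values, whereas the simulated share changes with the assumed input distributions, which is what makes a sensitivity analysis of the inputs possible.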

Strong positive correlations in the scatter plots of NO3 against multiple parameters (EC, TH, Ca2+, Mg2+ and Cl) indicate anthropogenic inputs, such as domestic sewage, agricultural runoff, oxidation of poultry wastes, etc. Prolonged application of N fertilizers has developed cation exchange processes between NH3 and Ca2+ and Mg2+, enhancing the mineralization of Ca2+ and Mg2+ in groundwater and thus leading to water hardness and elevated NO3 concentrations. The positive regression lines of F against pH, Na+ and HCO3, respectively, imply that alkaline pH with higher concentrations of NaHCO3 or Na+ or HCO3 ions in groundwater allows dissolution of fluoride-bearing minerals, such as muscovite, biotite, amphibole, fluorite, and hornblende, through rock-water interaction. Therefore, the fact that about 85% of samples show an HCO3/Ca2+ ratio > 1 indicates that groundwater conditions in the study area favour further fluoride enrichment. This condition is likely to increase human health risks in the long run, especially for infants and children, which is a matter of great concern for the entire study area. Chemometric modelling confirms that Ca2+, Mg2+, HCO3, F and SO42− are derived from geogenic sources, Cl and NO3 from anthropogenic inputs and Na+ and K+ from mixed factors. Further, integration of the extracted principal components (PCs) with each significant cluster enables prediction of the latent parameters influencing the sampling locations and confirmation of the various sources.

The study area needs clean drinking water free from NO3 and F for better human health. Based on the findings of the present work, the socio-environmental conditions and the hydrogeological setup, treatment of groundwater through membrane techniques (reverse osmosis and electrodialysis), ion exchange, adsorption, coagulation, and precipitation processes is highly recommended prior to human consumption. Also, since the literacy rate in the area is about 50%, efforts need to be made to raise mass awareness through various IEC (information, education, and communication) techniques to apprise people of the local groundwater conditions and what is best for their longevity. Further, to tackle similar problems elsewhere in the world, HHRA must be carried out both deterministically and probabilistically to get a holistic picture of groundwater vulnerability. Source apportionment of the contaminants must also be conducted with the help of chemometric techniques for better human judgement.