Uncovering the environmental conditions required for Phyllachora maydis infection and tar spot development on corn in the United States for use as predictive models for future epidemics

Webster, Richard W.; Nicolli, Camila; Allen, Tom W.; Bish, Mandy D.; Bissonnette, Kaitlyn; Check, Jill C.; Chilvers, Martin I.; Duffeck, Maíra R.; Kleczewski, Nathan; Luis, Jane Marian; Mueller, Brian D.; Paul, Pierce A.; Price, Paul P.; Robertson, Alison E.; Ross, Tiffanna J.; Schmidt, Clarice; Schmidt, Roger; Schmidt, Teryl; Shim, Sujoung; Telenko, Darcy E. P.; Wise, Kiersten; Smith, Damon L.

doi:10.1038/s41598-023-44338-6

Article
Open access
Published: 10 October 2023

Richard W. Webster¹^na1,
Camila Nicolli²^na1,
Tom W. Allen³,
Mandy D. Bish⁴,
Kaitlyn Bissonnette⁴,
Jill C. Check⁵,
Martin I. Chilvers⁵,
Maíra R. Duffeck⁶,
Nathan Kleczewski⁷,
Jane Marian Luis⁶,
Brian D. Mueller²,
Pierce A. Paul⁶,
Paul P. Price⁸,
Alison E. Robertson⁹,
Tiffanna J. Ross¹⁰,
Clarice Schmidt⁹,
Roger Schmidt¹¹,
Teryl Schmidt²,
Sujoung Shim¹⁰,
Darcy E. P. Telenko¹⁰,
Kiersten Wise¹² &
Damon L. Smith²

Scientific Reports volume 13, Article number: 17064 (2023)

Abstract

Phyllachora maydis is a fungal pathogen causing tar spot of corn (Zea mays L.), a new and emerging, yield-limiting disease in the United States. Since being first reported in Illinois and Indiana in 2015, P. maydis can now be found across much of the corn growing regions of the United States. Knowledge of the epidemiology of P. maydis is limited but could be useful in developing tar spot prediction tools. The research presented here aims to elucidate the environmental conditions necessary for the development of tar spot in the field and the creation of predictive models to anticipate future tar spot epidemics. Extended periods (30-day windowpanes) of moderate mean ambient temperature (18–23 °C) were most significant for explaining the development of tar spot. Shorter periods (14- to 21-day windowpanes) of moisture (relative humidity, dew point, number of hours with predicted leaf wetness) were negatively correlated with tar spot development. These weather variables were used to develop multiple logistic regression models, an ensembled model, and two machine learning models for the prediction of tar spot development. This work has improved the understanding of P. maydis epidemiology and provided the foundation for the development of a predictive tool for anticipating future tar spot epidemics.

Introduction

Tar spot, caused by Phyllachora maydis, is an emergent disease on corn (Zea mays L.) that can lead to significant yield losses in the United States^1,2. First recorded infecting corn in Mexico as early as 1904³, P. maydis has since been reported throughout much of Latin America⁴. Phyllachora maydis had never been documented in the United States until 2015 when tar spot was observed in multiple fields in northern Indiana and Illinois⁵. Since its arrival in the United States, P. maydis has rapidly spread throughout the midwestern corn belt of the United States (U.S.). It has also been found in Florida, and Ontario, Canada⁶, along with confirmations in Georgia and Virginia^7,8. Under ideal environmental conditions, tar spot can cause severe epidemics. In 2018 alone, tar spot caused estimated yield losses of close to 5 million metric tons, equating to over 680 million USD of economic losses².

Despite P. maydis being recognized as a pathogen of corn for over 100 years, there is still little understanding of its biology and epidemiology. Phyllachora maydis overwinters in the U.S. corn production regions, indicating the pathogen can survive on corn residue, and consequently serves as the inoculum source for at least the next season’s crop^9,10. Monthly temperatures of 17–22 °C, relative humidity greater than 75%, leaf wetness of 7 h per night and 10–20 foggy days per month were reported as the optimal conditions for tar spot development¹¹. Under controlled environments and optimal conditions, inoculation assays demonstrated a latent period of only 15 days, and sporulation occurring approximately 20 days post-inoculation¹².

As P. maydis continues to establish itself across several states in the U.S., an integrated management approach to mitigate the yield losses is needed. Partial genetic resistance for tar spot has been identified in corn germplasm^13,14, but many current commercial corn hybrids are considered highly susceptible. Fungicides are currently the most effective method for reducing tar spot development and yield losses, especially when two or three fungicide classes are used¹⁵.

Predictive modeling has been an effective tool for guiding the optimal timing of fungicide applications. Predictive models have been developed for a number of varying pathosystems including Fusarium head blight of wheat primarily caused by Fusarium graminearum in the U.S.^16,17,18,19, late blight of potato caused by Phytophthora infestans²⁰, Sclerotinia stem rot of soybean caused by Sclerotinia sclerotiorum^21,22, and fire blight of apple and pear caused by Erwinia amylovora^23,24. Many of these models have been integrated into decision support systems, allowing farmers access to the predictive abilities of these models. One successful example is Sporecaster (https://ipcm.wisc.edu/apps/sporecaster/), a decision support system for the prediction of Sclerotinia stem rot of soybean which is publicly available to farmers on smartphones²².

Historically, many predictive models have been developed using either linear or logistic regression models^19,20,21,22,24,25, but more recently predictive model development has shifted towards machine learning based analyses^26,27. One commonly used machine learning algorithm is a random forest (RF), which is an ensemble learning method for regressions²⁸. The RF framework aggregates many decision trees, improving precision by reducing variance relative to a single decision tree. However, RFs are not easily interpreted and overfitting can often occur. Another common machine learning algorithm is an artificial neural network²⁹ (ANN), which resembles the interconnectedness and signaling of biological neurons. ANNs are made up of an input layer, one or more hidden layers, and an output layer. Within this network there are multiple nodes, each connected to many additional nodes and each carrying its own associated weight and threshold. If the output from a single node meets a designated threshold, the node is triggered and sends data to the next layer of nodes. If a node does not meet the designated threshold, the node does not send data to the next layer. The downsides to using ANNs are the high level of complexity making it difficult to interpret the models, the considerable amount of computational power required to run these models, and the potential for overfitting. However, machine learning algorithms have been demonstrated to be highly effective at improving predictive capabilities due to their ability to model very complex and non-linear relationships^27,30.

The surge of new and/or re-emerging plant diseases represents one of the biggest challenges to food production in modern agriculture. As global climate change leads to instability of temperatures and changing precipitation patterns, the need to create greater resilience in our crop production systems has become crucial³¹. There are several knowledge gaps regarding tar spot development on corn. The tar spot cycle is not fully understood, specifically, the incubation and latent periods have not been clearly established for P. maydis in production settings, and information on pathogen dispersal is limited. Knowledge of these processes is critical in understanding the polycyclic nature of tar spot epidemics. Therefore, the goals of the current study were to discern the environmental variables that are most important for the development of tar spot, and to develop statistical models for the prediction of future tar spot epidemics in the U.S. that would maximize the precision of in-season management decisions.

Results

Development of training and testing datasets

From this study, a dataset was compiled with 588 observations across the Midwest region of the U.S. including a binary response variable for the increase in P. maydis stroma between two consecutive rating dates. Of these 588 observations, 179 observations were taken from small-plot research trials between 2018 and 2022, and the additional 409 observations were taken from production fields between 2020 and 2022 (Fig. 1). From the combined 588 observations, designated training and testing data sets were created using a 70:30 split by randomly sampling from the full data set (small-plot and commercial fields combined) with replacement in which 70% of the observations were placed in the training data set and 30% of the observations were placed in the testing data set. The training data set included 96 observations where P. maydis developed or increased in severity and 310 observations in which P. maydis did not develop or increase in severity from the previous date. The testing data set included 36 observations in which P. maydis did increase in severity and 146 observations in which P. maydis did not develop or increase in severity. After the development of these two datasets, the training dataset was used for assessment of weather parameters and model development, while the testing data set was used for validation of the developed models from the training data set.

Assessment of weather parameters

Multiple weather variables from the IBM historical weather data service were examined in this study across three levels of moving averages (windowpanes), 30-day (Fig. 2A), 21-day (Fig. 2B), and 14-day (Fig. 2C). By evaluating Pearson correlation coefficients of these moving averages in relation to the delta response variable, the strongest correlations were detected for the 30-day moving averages of the daily minimum ambient temperature and the daily mean ambient temperature with coefficients of − 0.39 and − 0.38, respectively (Fig. 2A, Suppl. Table 1). Within the 21-day moving averages, the two variables with the strongest correlations to P. maydis development or increase were the daily minimum dew point and the daily minimum temperature with coefficients of − 0.36 and − 0.35, respectively (Fig. 2B, Suppl. Table 1). Overall, there were eight 30-day moving average variables, fifteen 21-day moving average variables, and sixteen 14-day moving average variables significantly correlated with P. maydis development or increase in severity (Fig. 2, Suppl. Table 1).
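The correlation step described above can be sketched in a few lines. This is a minimal illustration only: the authors worked in R, whereas this sketch uses plain Python, and the input values and the name `pearson_r` are hypothetical. It computes a Pearson coefficient between a moving-average weather variable and the binary delta response.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical 30-day moving averages of daily mean temperature (deg C)
# paired with hypothetical binary delta values (1 = severity increased).
mean_temp_30d = [17.5, 19.0, 21.0, 22.5, 24.0, 25.5]
delta = [1, 1, 1, 0, 0, 0]
r = pearson_r(mean_temp_30d, delta)  # negative: warmer windows, fewer increases
```

With a binary response, the Pearson coefficient is equivalent to a point-biserial correlation, which is why a negative sign here reads as "higher values of the weather variable coincide with fewer observed increases".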

Hock et al.¹¹ previously proposed that relative humidity (RH) levels are extremely important in explaining P. maydis presence or increase in severity, especially at mean RH levels of 75% or greater. To investigate the impact of RH on P. maydis, we evaluated multiple 30-day moving averages of daily total hours of RH levels that ranged from 60 to 95% at 5% increments. Daily total hours of RH greater than 90% was significantly negatively correlated with P. maydis development or severity increase for all three levels of moving averages (Fig. 2, Suppl. Table 1). Furthermore, daily total hours of RH greater than 85% was also significantly negatively correlated with P. maydis development or severity increase for the 21-day and 14-day moving averages. Since these results suggested the importance of the 90% RH threshold, we also investigated the correlation of nighttime total hours of RH greater than 90% between 8 pm and 6 am. Nighttime total hours of RH greater than 90% was more strongly negatively correlated in all three levels of moving averages than the originally assessed values (Suppl. Table 1). Overall, the majority of the discussed RH variables were negatively correlated with P. maydis development or severity increase (Fig. 2, Suppl. Table 1). Additionally, a daily total wetness hour parameter was assessed, serving as a proxy for the presence of leaf wetness. Similar to RH greater than 90% at night, the wetness hour parameter was evaluated as two distinct parameters: the total daily hours with predicted wetness and the total nighttime hours with predicted wetness. Neither wetness hour parameter was significantly correlated with P. maydis development or severity increase in the 30-day moving averages, but both were significant for the 21-day and 14-day moving averages. The total nighttime wetness hours parameter was most strongly correlated at the 14-day moving average with a correlation coefficient of − 0.17 (P = 0.001, Fig. 2, Suppl. Table 1).

All assessed weather variables were used to create single variable logistic regression (LR) models for explaining P. maydis development and severity increase. These models were then evaluated by comparing Akaike information criterion (AIC) values, C-statistic values, and Hosmer–Lemeshow goodness-of-fit test P-values. From these evaluations, the two best fitting models were the models using 30-day moving averages of either the daily minimum temperature or the daily mean temperature (Fig. 3, Suppl. Table 2). When these two parameters were examined against the predicted risk probability, the observed inflection points were 15.4 °C for the daily minimum temperature and 20.5 °C for the daily mean temperature (Fig. 3).

Many of the moisture-related parameters fit best when using either the 14-day or 21-day moving averages, compared to the 30-day moving average. For example, the daily minimum dew point (DP) fit best when using the 21-day moving average compared to the 14- and 30-day moving averages (Suppl. Table 2). From the predicted risk probabilities based on the 21-day moving average of the daily minimum DP, an inflection point of 13.1 °C was observed (Fig. 3). In addition, the nighttime total hours of RH greater than 90% and the total nighttime hours with predicted wetness parameters fit best when using the 14-day moving averages. This suggests the importance of lower moisture in the 14 to 21 days prior to P. maydis development or increase in severity (Suppl. Table 2).

Development of predictive models

With many single variable models developed, multi-variable models were then developed using the results of the previous assessments. Since the 30-day moving averages of the daily minimum temperature and the daily mean temperature were the two most influential variables (Fig. 2, Suppl. Tables 1 and 2), these two variables were examined more closely. Daily mean temperature was consistently more influential than the daily minimum temperature, and thus this variable was included in all subsequent models, which included many moisture variables. Eight models were chosen based on their input variables and favorable statistics reported above. Four of these models used the 30-day moving averages and included the combination of the daily mean temperature with either the daily total hours of RH greater than 90%, daily total wetness hours, daily minimum dew point depression (DPD), or the daily maximum RH (Suppl. Table 2). After these models were developed, combinations of different moving averages of weather parameters were explored due to the difference in influence shown by the correlation coefficients and the single variable LRs (Suppl. Tables 1 and 2). Four models were identified which all included the 30-day moving average of the daily mean temperature in addition to the 21- and 14-day moving averages of either the daily total hours with RH greater than 90%, the daily total nighttime hours with RH greater than 90%, the daily total wetness hours, or the daily total of nighttime wetness hours.

The corresponding eight LR models (LR1–LR8) were selected to be validated using the previously established testing dataset. The linearized logistic models for these eight LRs are defined as:

$$\begin{aligned} Logit_{LR1} & = 21.92522 - 0.97199\left( {30-day\;moving\;average\;mean\;temperature} \right) \\ & \quad - \;0.25014\left( {30-day\;moving\;average\;of\;daily\;total\;of\;hours\;with\;RH > 90} \right) \\ \end{aligned}$$

(1)

$$\begin{aligned} Logit_{LR2} & = 22.6108 - 0.9880\left( {30-day\;moving\;average\;mean\;temperature} \right) \\ & \quad - \;6.0357\left( {30-day\;moving\;average\;of\;daily\;mean\;wetness\;hours} \right) \\ \end{aligned}$$

(2)

$$\begin{aligned} Logit_{LR3} & = 17.7869 - 0.8964\left( {30-day\;moving\;average\;mean\;temperature} \right) \\ & \quad + \;0.8157\left( {30-day\;moving\;average\;of\;daily\;minimum\;dew\;point\;depression} \right) \\ \end{aligned}$$

(3)

$$\begin{aligned} Logit_{LR4} & = 32.06987 - 0.89471\left( {30-day\;moving\;average\;mean\;temperature} \right) \\ & \quad - \;0.14373\left( {30-day\;moving\;average\;of\;daily\;maximum\;relative\;humidity} \right) \\ \end{aligned}$$

(4)

$$\begin{aligned} Logit_{LR5} & = 21.21170 - 0.94178\left( {30-day\;moving\;average\;mean\;temperature} \right) \\ & \quad - \;0.23661\left( {21-day\;moving\;average\;of\;daily\;total\;of\;hours\;with\;RH > 90} \right) \\ \end{aligned}$$

(5)

$$\begin{aligned} Logit_{LR6} & = 20.35950 - 0.91093\left( {30-day\;moving\;average\;mean\;temperature} \right) \\ & \quad - \;0.29240\left( {14-day\;moving\;average\;of\;daily\;total\;nighttime\;hours\;with\;RH > 90} \right) \\ \end{aligned}$$

(6)

$$\begin{aligned} Logit_{LR7} & = 22.18844 - 0.96662\left( {30-day\;moving\;average\;mean\;temperature} \right) \\ & \quad - \;0.25134\left( {21-day\;moving\;average\;of\;daily\;total\;of\;wetness\;hours} \right) \\ \end{aligned}$$

(7)

$$\begin{aligned} Logit_{LR8} & = 21.66220 - 0.94504\left( {30-day\;moving\;average\;mean\;temperature} \right) \\ & \quad - \;0.34001\left( {14-day\;moving\;average\;of\;daily\;total\;nighttime\;wetness\;hours} \right). \\ \end{aligned}$$

(8)

The eight models were then assessed based on multiple model quality characteristics including accuracy (%), kappa value, type I error (%), type II error (%), precision (%), and recall (%) when using a 35% risk probability threshold for the prediction of P. maydis development or severity increase. From these evaluations, LR6 had the greatest accuracy (86.8%), greatest kappa value (0.59), greatest recall (69.4%), and the lowest type II error rate (30.6%, Table 1). Furthermore, LR4 had some of the best values for accuracy (86.3%), kappa (0.56), type I error rate (8.2%), and precision (65.7%, Table 1). From these results, a multi-model ensemble was created using LR4 and LR6 (Fig. 4) to more robustly predict the development of P. maydis or increase in its severity. The corresponding multi-model ensemble improved the accuracy (87.4%), kappa value (0.61), and precision (67.6%) while maintaining low type I error rate (8.2%), low type II error rate (30.6%), and high recall (69.4%, Table 1).
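As a rough illustration of how Eqs. (4) and (6) turn into the ensemble risk described above, the following Python sketch converts each logit to a probability with the standard inverse-logit transform, averages the two probabilities, and applies the 35% threshold. The coefficients are copied from the equations; the function and argument names are hypothetical, the units (°C for temperature, % for RH, hours for the RH count) are assumed from the weather variables described in the methods, and treating a probability exactly at the threshold as a positive prediction is also an assumption.

```python
import math

RISK_THRESHOLD = 0.35  # 35% risk probability threshold used in the evaluation


def inverse_logit(logit):
    """Convert a linearized logistic (logit) value to a probability."""
    return 1.0 / (1.0 + math.exp(-logit))


def lr4_probability(mean_at_30d, max_rh_30d):
    """Eq. (4): 30-day mean air temperature and 30-day daily maximum RH."""
    logit = 32.06987 - 0.89471 * mean_at_30d - 0.14373 * max_rh_30d
    return inverse_logit(logit)


def lr6_probability(mean_at_30d, night_rh90_hours_14d):
    """Eq. (6): 30-day mean air temperature and 14-day nighttime RH > 90 hours."""
    logit = 20.35950 - 0.91093 * mean_at_30d - 0.29240 * night_rh90_hours_14d
    return inverse_logit(logit)


def ensemble_probability(mean_at_30d, max_rh_30d, night_rh90_hours_14d):
    """Average of the LR4 and LR6 risk probabilities."""
    p4 = lr4_probability(mean_at_30d, max_rh_30d)
    p6 = lr6_probability(mean_at_30d, night_rh90_hours_14d)
    return (p4 + p6) / 2.0


def predicts_increase(probability):
    """Assumed decision rule: positive when risk meets or exceeds the threshold."""
    return probability >= RISK_THRESHOLD


# Hypothetical inputs: a moderate-temperature window and a hot, humid window.
moderate = ensemble_probability(20.0, 90.0, 5.0)
hot_humid = ensemble_probability(26.0, 95.0, 8.0)
```

Because both temperature coefficients are negative, the modeled risk falls as the 30-day mean temperature rises, which matches the direction of the correlations reported earlier.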

Table 1 Model evaluation metrics for eight logistic regression models (LR1-LR8), a multi-model ensemble, a random forest model (RF), and an artificial neural network model (ANN) for predicting the development of tar spot on corn between 2018 and 2022 (n = 182).

Furthermore, two machine learning algorithms were developed for the prediction of P. maydis development and increase in severity: an RF model using 500 trees and an ANN model using nine hidden layers. These two machine learning algorithms were assessed similarly to the previous LRs and the multi-model ensemble. The RF consistently outperformed all other models for every metric except for recall, in which it had the lowest observed value (58.3%, Table 1). The corresponding RF had an observed accuracy of 90.1%. The ANN resulted in a model accuracy of 85.7% and a relatively high type II error rate (38.9%), and all other metrics were unremarkable (Table 1).
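The evaluation metrics used above can all be derived from a 2 × 2 confusion matrix. The Python sketch below shows one way to compute them. It assumes that type I error is the false positive rate and type II error is the false negative rate, which is consistent with the reported type II error and recall summing to 100%. The counts in the example are not reported in the text: they are back-calculated from the ensemble's reported percentages and the 36 positive and 146 negative testing observations, so they should be read as an illustration only.

```python
def evaluate(tp, fn, fp, tn):
    """Compute model quality metrics from confusion-matrix counts.

    tp/fn: observed increases predicted as increase / no increase.
    fp/tn: observed non-increases predicted as increase / no increase.
    """
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return {
        "accuracy": accuracy,
        "kappa": (accuracy - expected) / (1.0 - expected),
        "type_i_error": fp / (fp + tn),   # assumed: false positive rate
        "type_ii_error": fn / (tp + fn),  # assumed: false negative rate
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Counts inferred (not reported) from the ensemble row of the evaluation.
ensemble_metrics = evaluate(tp=25, fn=11, fp=12, tn=134)
```

Under these inferred counts, the computed values round to the ensemble figures quoted in the text (87.4% accuracy, kappa 0.61, 67.6% precision, 69.4% recall).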

Discussion

Through this study, a deeper epidemiological understanding of P. maydis has been uncovered. The corresponding research suggests the development and increase in P. maydis stroma under field conditions is primarily driven by extended periods of moderate mean ambient temperature (18–23 °C) but tar spot is discouraged by extended periods of high relative humidity (> 90%). Additionally, the development of multiple statistical models offers a tool in production systems to guide fungicide applications to help farmers maximize their return on investment.

From our findings, the weather parameters with the strongest correlations to P. maydis development and severity increase included the 30-day moving averages of either daily minimum temperature or the daily mean temperature (Suppl. Table 1). These two weather parameters were also correlated with P. maydis development and severity increase from the 21-day and 14-day moving averages but were not as highly correlated as the 30-day moving averages (Suppl. Table 1). Specifically, moderately warm air temperatures appear to drive P. maydis development or severity increase, while excessively warm conditions, mean temperatures greater than 23 °C, considerably decreased the probability of P. maydis progression (Fig. 2B). This demonstrates there is a long-term influence of moderate ambient temperature that drives epidemiological processes within the tar spot cycle. These results confirm previous reports by Hock et al.¹¹ that moderate temperatures (17–22 °C) were one of the primary determinants of tar spot progress and severity. Hock et al.¹¹ also reported that during warm seasons with mean ambient temperatures greater than 22 °C, tar spot development was minimal. Moderate mean temperatures could be influencing multiple epidemiological processes within the disease cycle such as germination of initial inoculum, infection of the host, mycelial colonization within host tissue, or the development of the ascomata. Temperature has been well characterized to play an important role in all of these processes in many other fungal organisms²¹. Additional investigations on the effects of temperature on the development of P. maydis on corn are still needed to further elucidate this relationship.

In addition to temperature, moisture weather variables were consistently observed to influence tar spot development, although to a lesser degree than temperature. Specifically, the 21-day moving average of the daily minimum dew point (DP) had the third greatest overall correlation with P. maydis development or severity increase, but this was actually a negative correlation (Fig. 2, Suppl. Table 1). Many additional moisture parameters were significantly correlated with P. maydis development or severity increase across all three levels of moving averages, such as the RH90 and the nighttime RH90 variables, as has been previously reported (Fig. 2). Interestingly, these moisture variables are negatively correlated with increasing tar spot severity. Breunig et al.¹² point out that in controlled-environment inoculations frequent misting was only required in the first 5 days after inoculation. After 5 days, misting had to be withheld to produce stroma in these controlled environments. Perhaps leaf wetness is required by the fungus for spore germination and leaf penetration, while excessive moisture later in the infection process can lead to conditions unfavorable for the progression of infection. Regardless, these moisture variables are clearly playing a role in the biological processes driving the development of P. maydis. Variables such as DP and RH are still dependent on temperature. Thus, the relationships presented here demonstrate the complexity that exists between the roles of ambient temperature and moisture on P. maydis development in the physical environment. As Hock et al.¹¹ reported, RH levels greater than 75% were important for tar spot development. We examined the effect of different RH thresholds ranging from 60 up to 95%, and we consistently determined the 90% RH threshold was the most influential of the eight examined RH thresholds for explaining the development and increase in severity of P. maydis, with RH90 being significantly negatively correlated with P. maydis stroma development in all three levels of moving averages (Suppl. Table 1) and resulting in the best fitting models across all RH thresholds (Suppl. Table 2). Our data agree with those of Hock et al.¹¹ in that RH was very important in predicting P. maydis development; however, the work presented here suggests that extended periods of high RH are antagonistic to the development of tar spot. These findings seem consistent with the observation of Breunig et al.¹² that only intermittent periods of wetness are needed to support the development of tar spot. Thus, the results presented here refine our understanding of the role RH plays in the epidemiology of tar spot in the U.S.

Another important objective of this study was to compare LR models to more modern machine learning algorithms. From our study, an RF machine learning algorithm resulted in one of the best models and had the greatest observed model accuracy of 90.1%. The ANN examined in this study did not result in model accuracy as high as several of the LR models developed here. Two LRs, LR4 and LR6, were highly accurate. Accuracy was further improved by ensembling these two LR models, with an accuracy estimate of 87.4% while either improving or maintaining all of the additional model assessment characteristics (Table 1). Our analyses demonstrate that the RF machine learning algorithm was slightly more accurate in predicting P. maydis development compared to LRs, but a multi-model ensemble using two of the LRs was still comparable in predicting P. maydis development while balancing all goodness-of-fit statistics. These results confirm previous studies on predicting plant diseases with a high degree of accuracy using different machine learning algorithms^27,32,33. However, in some studies logistic regressions were reported to still be the most accurate at predicting plant diseases³⁴.

Functionally, LR models may be more useful in actual delivery to farmers and can be easily programmed into smartphone application decision support systems (DSS) as has been previously demonstrated²². The models presented here have high potential for improving the application timings of fungicides for managing tar spot. Tar spot may result in severe yield losses; thus, farmers often rely on multiple fungicide applications during the season, which can equate to high economic and environmental costs. The use of these DSS will guide farmers in optimizing fungicide application timing to protect the plant from the pathogen when it is most likely to cause disease. Furthermore, using these DSS can also eliminate unnecessary applications, which benefits farmers by limiting needless economic inputs and decreasing chemical inputs into the environment.

While these identified models are highly accurate at predicting P. maydis development, there is inherent error associated with any model. This error could be explained by variability in environmental conditions which could not be accounted for, the quantity of initial inoculum, the population structure of the pathogen within the field, or resistance levels among site-years. Additionally, there may have been interrater variability associated with disease ratings, especially since these data were collected from multi-state projects with numerous raters across multiple years. However, we minimized this error with the use of standard area diagrams (CPN) that help improve disease severity estimates³⁵.

The current study sheds light on epidemiological processes that are driving the development of a newly emerged pathogen of corn capable of causing severe disruptions to agricultural production. Overall, extended periods (30 days) of cool temperature appear to be most important for tar spot development, with an apparent interaction with shorter periods (14–21 days) of low moisture conditions. The work presented here has also paved the way for the development of a DSS for tar spot. Work is underway to incorporate these models into the Tarspotter DSS (https://ipcm.wisc.edu/apps/tarspotter/) to further improve tar spot prediction and better inform farmers of risk due to plant disease.

Material and methods

Field trials

Small plot field trials were planted between 2018 and 2022 in the following states: Illinois, Iowa, Indiana, Kentucky, Michigan, Missouri, Ohio, and Wisconsin (Fig. 1). Locally adapted hybrids were used at each location. The use of all plant material in this study did not require any specific permissions or licenses. Trials at each location followed locally recommended management practices such as seeding rates, nitrogen fertilization, and herbicides, with a small number of trials overhead irrigated. Field trials in 2018 and 2019 included the use of fungicide applications, but only the non-treated plots were considered for this study. All small plot research trials were designed as randomized complete block designs. No fungicide applications were made in trials conducted between 2020 and 2022. Commercial field sites were also assessed across the Midwest U.S. between 2021 and 2022, and these included fields under regional grower conditions. These commercial fields were not designed for research and served as locations for observing disease development. The conducted field trials were performed with permission from local commercial grower collaborators and were compliant with all institutional, national, and international guidelines and legislation. Additional field information is provided in Supplementary Table 3.

Data collection

Phyllachora maydis ratings started at the R1 growth stage (silking) and continued until the R5-R6 growth stage (dent to full maturity). The number of P. maydis severity ratings during this period ranged from two to seven depending on the site-year. In the small plot trials, P. maydis severity was rated by visually assessing the percentage of P. maydis-induced stroma on the ear leaf of five to ten plants per plot (sub-samples) using a standardized rating scale³⁶, and all ratings were averaged across the entire plot. For each rating date in each site-year, all plot severity scores were averaged into a single severity score for that site-year and date. In the commercial fields, five corn plants were randomly selected across the field and were evaluated using the same protocol as described for small plot trials. For each rating date, disease ratings of the five plants were averaged to calculate a single severity score. The database compiled for developing the prediction models consisted of the average P. maydis severity on the ear leaf for each assessment day. The severity data were aggregated into a single file. For each location, all ratings were aligned in sequential order by date. A binary delta variable was defined as the increase in severity of P. maydis stroma between two sequential rating dates, such that a delta value of 1 was given for any positive increase in P. maydis severity between two sequential dates. If no increase in P. maydis severity was observed, a delta value of 0 was given. Thus, delta values of 1 define P. maydis increase while delta values of 0 define no increase.
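The binary delta variable described above can be sketched as follows. This is a minimal Python illustration with hypothetical severity values and a hypothetical function name. It assumes the ratings for one location are already sorted by date.

```python
def binary_delta(severities):
    """Return 1 for each pair of sequential ratings where severity increased, else 0.

    `severities` holds the mean ear-leaf severity (%) for one location,
    ordered by rating date. The result has one fewer element than the input
    because each delta compares a rating with the one before it.
    """
    return [
        1 if later > earlier else 0
        for earlier, later in zip(severities, severities[1:])
    ]


# Hypothetical ratings for one site-year, in date order.
ratings = [0.0, 0.0, 0.5, 0.5, 2.0]
deltas = binary_delta(ratings)  # [0, 1, 0, 1]
```

Note that a decrease or no change in the averaged severity both map to 0, so the response captures only whether disease progressed between two visits.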

Weather data collection

Site-specific weather data was collected using IBM historical weather services. Hourly average weather data was pulled from this service at a resolution of 4 km grids using GPS coordinates for each field location. The collected weather data included the hourly averages of ambient air temperature (AT, °C), relative humidity (RH, %), wind speed (WS, m/s), dew point (DP, °C), and precipitation (mm/hour). From these hourly weather data, dew point depression (DPD, °C) values were calculated for each hour by taking the absolute value of the difference between the AT and DP. A binary wetness hour variable (WH) was calculated by defining a ‘1’ if the DPD was less than or equal to 2 °C, predicting the presence of free water on leaf surfaces, and a ‘0’ was defined if the DPD was greater than 2 °C^37,38. Additionally, a binary nighttime wetness hour variable was calculated similarly to the previously described wetness hour variable but could only be considered true between the hours of 8 pm and 6 am. All other daytime hours were considered a ‘0’ value. A binary RH variable (RH95) was calculated by defining a ‘1’ if the RH was greater than or equal to 95%, and a ‘0’ was defined if the RH level was less than 95%. Additional binary RH variables (RH90, RH85, RH80, RH75, RH70, RH65, and RH60) were calculated similarly with RH thresholds of either 90%, 85%, 80%, 75%, 70%, 65%, or 60%. A binary nighttime RH90 variable was calculated by defining a ‘1’ if the RH was greater than 90% between the hours of 8 pm and 6 am; nighttime hours with lower RH and all daytime hours were defined as a ‘0’.
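The hourly derived variables can be sketched as below. This is an illustrative Python version, not the authors' code. The function names are hypothetical, hours are assumed to be local clock hours from 0 to 23, and the night window is assumed to include the 8 pm hour and exclude the 6 am hour, a boundary detail the text does not specify.

```python
def is_night(hour):
    """Assumed night window: 20:00 through 05:59 local time."""
    return hour >= 20 or hour < 6


def hourly_variables(hour, air_temp_c, dew_point_c, rh_pct):
    """Derive DPD and the binary wetness and RH indicators for one hour."""
    dpd = abs(air_temp_c - dew_point_c)  # dew point depression (deg C)
    wet = 1 if dpd <= 2.0 else 0         # predicted leaf wetness
    night = is_night(hour)
    return {
        "DPD": dpd,
        "WH": wet,
        "night_WH": wet if night else 0,
        # Daily RH indicators follow the RH95 rule (greater than or equal to).
        "RH90": 1 if rh_pct >= 90.0 else 0,
        # The nighttime RH90 rule is stated as strictly greater than 90%.
        "night_RH90": 1 if night and rh_pct > 90.0 else 0,
    }


# Hypothetical hours: a humid night hour and a dry afternoon hour.
humid_night = hourly_variables(22, 18.0, 17.0, 96.0)
dry_afternoon = hourly_variables(14, 25.0, 15.0, 50.0)
```

The other RH indicators (RH95 down to RH60) would follow the same pattern as `RH90` with a different threshold.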

From these hourly weather values, daily mean, minimum, and maximum values were calculated for each of the following variables: AT, RH, WS, DP, and DPD. Daily mean and daily maximum precipitation rates were calculated. Daily totals for WH, nighttime WH, RH95, RH90, RH85, RH80, RH75, RH70, RH65, RH60, and nighttime RH90 were also calculated for each location. After all daily means, minimums, maximums, and totals were calculated, 30-day, 21-day, and 14-day moving averages (windowpanes) were calculated for each of the weather variables using the rollmean() function from the ‘zoo’ package in R³⁹,⁴⁰. “Window-paning” has been useful in modeling Fusarium head blight, for instance, allowing epidemiologists to find and define specific time frames in which weather variables are influential in plant disease development⁴¹. Finally, the previously established binary delta values were paired with the 30-day, 21-day, or 14-day moving averages of weather data for the second rating date.
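The daily summaries and moving averages can be sketched as below. This Python sketch is illustrative and does not reproduce the R code used in the study. The function names are hypothetical, and the window is assumed to be trailing (it ends on the day of interest); the text does not state which window alignment was used.

```python
from statistics import mean


def daily_summary(hourly_values):
    """Daily mean, minimum, and maximum of one variable's hourly values."""
    return {
        "mean": mean(hourly_values),
        "min": min(hourly_values),
        "max": max(hourly_values),
    }


def trailing_mean(daily_values, window):
    """Moving average over the `window` days ending on each day.

    The first value corresponds to day index window - 1; an empty list is
    returned when fewer than `window` days are available.
    """
    return [
        sum(daily_values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(daily_values))
    ]


# Hypothetical hourly air temperatures (deg C) for part of one day.
summary = daily_summary([16.0, 18.0, 24.0, 22.0])

# Hypothetical daily mean air temperatures; a 3-day window keeps the example short.
daily_mean_temp = [20.0, 22.0, 21.0, 19.0, 18.0]
moving_avg = trailing_mean(daily_mean_temp, window=3)
print(summary)
print(moving_avg)
```

With window set to 30, 21, or 14, the value at the index of the second rating date would be the weather predictor paired with that date's delta value.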

Correlation analysis and logistic regression model development

First, the total dataset was split to create training (n = 406) and testing (n = 182) datasets using bootstrapping with replacement. Correlation analyses were performed in R using the rcorr() function from the ‘Hmisc’ package⁴². These analyses calculated the Pearson correlation coefficients for the delta values with respect to the 30-day, 21-day, or 14-day moving averages (windowpanes). Correlations were considered significant at a P-value of less than 0.05 (Suppl. Table 1). All LRs were developed with the delta variable as the response variable, as a method to predict the increasing development of tar spot. Single-variable LRs were created using each of the 30-day, 21-day, or 14-day moving averages for each of the weather parameters previously described. Additional multi-variable LRs were developed using combinations of these weather variables. All LRs were evaluated by Akaike information criterion (AIC) values and by the area under the receiver operating characteristic curve (C statistic) using the Cstat() function from the ‘DescTools’ package in R⁴³, and were tested by the Hosmer–Lemeshow goodness of fit test (HL test) using the hltest() function from the ‘glmtoolbox’ package in R⁴⁴. Favorable models were those with the lowest AIC values, the highest C statistics, and an HL test P-value of greater than 0.05. From these assessments, eight LR models (LR1-LR8) were identified for further evaluation. Additionally, a multi-model ensemble was created by taking the daily average risk probability from the LR4 and LR6 models. An exhaustive approach was used to examine all other multi-model ensembles, but the ensemble pursued was determined to be the best-fitting model.
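Two of the quantities described above, the ensemble risk probability and the C statistic, can be illustrated with a short sketch. It is written in Python for illustration and is not the authors' code; the function names and the probability values are hypothetical, and no model coefficients are implied. The C statistic is computed here directly as the proportion of concordant (increase, no increase) pairs, with ties counted as one half, which is equivalent to the area under the ROC curve.

```python
def ensemble_probability(p_model_a, p_model_b):
    """Average the predicted risk probabilities of two models, element-wise."""
    return [(a + b) / 2 for a, b in zip(p_model_a, p_model_b)]


def c_statistic(probabilities, outcomes):
    """Concordance statistic for binary outcomes coded 0 or 1."""
    events = [p for p, y in zip(probabilities, outcomes) if y == 1]
    non_events = [p for p, y in zip(probabilities, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("both outcome classes must be present")
    score = 0.0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                score += 1.0   # concordant pair
            elif p_event == p_non_event:
                score += 0.5   # tied pair
    return score / (len(events) * len(non_events))


# Hypothetical predicted probabilities from two models for four observations.
p_first = [0.1, 0.7, 0.9, 0.5]
p_second = [0.3, 0.5, 0.7, 0.1]
observed_delta = [0, 0, 1, 1]

p_ensemble = ensemble_probability(p_first, p_second)
c_value = c_statistic(p_ensemble, observed_delta)
print(p_ensemble, c_value)
```

A C statistic of 0.5 corresponds to no discrimination between increase and no-increase observations, and 1.0 to perfect discrimination.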

Evaluation against machine learning algorithms

To evaluate whether the developed LR models adequately predicted the progression of P. maydis on corn plants, the eight best-fitting LR models and the ensemble model were compared against two machine learning algorithms: random forests (RF) and artificial neural networks (ANN). Using the training dataset, an RF model was fit with the delta response variable explained by all predictor variables using the randomForest() function from the ‘randomForest’ package in R⁴⁵, with a total of 500 trees and all other hyperparameters left at their defaults. The resulting RF model was then applied to the testing dataset to determine its accuracy in predicting the delta response variable. The training dataset was also used to create an ANN using the neuralnet() function from the ‘neuralnet’ package in R⁴⁶, with nine hidden layers and all other hyperparameters set to their defaults. This ANN was then evaluated for its ability to predict the delta response variable in the testing dataset. The eight LR models, the ensemble model, and the two machine learning models were evaluated against the testing dataset for accuracy (%), kappa values, type I error (%), type II error (%), precision (%), and recall (%). These metrics were calculated for each model using the confusionMatrix() function from the ‘caret’ package in R⁴⁷.
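The model fitness metrics listed above can all be derived from a two-by-two confusion matrix, as the following sketch shows. It is written in Python for illustration and does not reproduce the output of confusionMatrix(). The function name and example data are hypothetical, and the error definitions are assumptions: type I error is taken as the false positive rate (predicted increase when none was observed) and type II error as the false negative rate.

```python
def classification_metrics(observed, predicted):
    """Accuracy, Cohen's kappa, error rates, precision, and recall (as proportions)."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)
    n = tp + tn + fp + fn

    def ratio(numerator, denominator):
        # Undefined ratios (zero denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    accuracy = ratio(tp + tn, n)
    # Agreement expected by chance, from the row and column totals.
    expected = ratio((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)
    return {
        "accuracy": accuracy,
        "kappa": ratio(accuracy - expected, 1 - expected),
        "type_I_error": ratio(fp, fp + tn),
        "type_II_error": ratio(fn, fn + tp),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }


# Hypothetical observed and predicted delta values for eight test observations.
observed = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(observed, predicted)
print(metrics)
```

Multiplying each proportion other than kappa by 100 gives the percentages reported in the text.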

Data availability

The datasets generated and/or analysed during the current study are not publicly available to preserve farm-level anonymity, but county-level data are available from the corresponding author on reasonable request.

References

1. Telenko, D. E. et al. How tar spot of corn impacted hybrid yields during the 2018 Midwest epidemic. Crop Prot. Netw. https://doi.org/10.31274/cpn-20190729-002 (2019).
2. Mueller, D. S. et al. Corn yield loss estimates due to diseases in the United States and Ontario, Canada, from 2016 to 2019. Plant Health Prog. 21, 238–247 (2020).
3. Maublanc, A. Especes Nouvelles de champignons inferieurs. Bull. de la Societe Phytopathologique Francaise. 20, 72 (1904).
4. Hock, J., Kranz, J. & Renfro, B. L. El complejo “mancha de asfalto” de maíz: Su distribucción geográfica, requisitos ambientales e importancia económica en México. Rev Mex Fitopatol 7, 129–135 (1989).
5. Ruhl, G. et al. First report of tar spot on corn caused by Phyllachora maydis in the United States. Plant Dis. 100, 1496 (2016).
6. McCoy, A. G. et al. First report of tar spot on corn (Zea mays) caused by Phyllachora maydis in Florida, Iowa, Michigan, and Wisconsin. Plant Dis. 102, 9 (2018).
7. Pandey, L. et al. First report of tar spot on corn caused by Phyllachora maydis in Georgia, United States. Plant Dis. 100 (2022).
8. Corn ipmPIPE. Maps of tar spot. https://corn.ipmpipe.org/tarspot/ (2022).
9. Kleczewski, N. M., Donnelly, J. & Higgins, R. Phyllachora maydis, causal agent of tar spot on corn, can overwinter in northern Illinois. Plant Health Prog. 20, 178–178 (2019).
10. Groves, C. L., Kleczewski, N. M., Telenko, D. E. P., Chilvers, M. I. & Smith, D. L. Phyllachora maydis ascospore release and germination from overwintered corn residue. Plant Health Prog. 21, 26–30 (2020).
11. Hock, J., Kranz, J. & Renfro, B. L. Studies on the epidemiology of the tar spot disease complex of maize in Mexico. Plant Pathol. 44, 490–502 (1995).
12. Breunig, M., Bittner, R., Dolezal, A., Ramcharan, A. & Bunkers, G. An assay to reliably achieve Tar Spot symptoms on corn in a controlled environment. bioRxiv https://doi.org/10.1101/2023.01.12.523803 (2023).
13. Yan, S. et al. Association mapping of resistance to tar spot complex in maize. Plant Breed. 141, 745–755 (2022).
14. Singh, R., Shim, S., Telenko, D. E. P. & Goodwin, S. B. The parental inbred lines of the nested association mapping (NAM) population of corn show sources of resistance to tar spot in northern Indiana. Plant Dis. https://doi.org/10.1094/PDIS-02-22-0314-SC (2022).
15. Telenko, D. E. et al. Fungicide efficacy on tar spot and yield of corn in the Midwestern United States. Plant Health Prog. 23, 281–287 (2022).
16. Shah, D. A., Paul, P. A., De Wolf, E. D. & Madden, L. V. Predicting plant disease epidemics from functionally represented weather series. Philos. Trans. R. Soc. B. 374, 20180273 (2019).
17. Shah, D. A., De Wolf, E. D., Paul, P. A. & Madden, L. V. Functional data analysis of weather variables linked to Fusarium head blight epidemics in the United States. Phytopathology 109, 96–110 (2019).
18. Shah, D. A., De Wolf, E. D., Paul, P. A. & Madden, L. V. Predicting Fusarium head blight epidemics with boosted regression tree. Phytopathology 104, 702–714 (2014).
19. Shah, D. A. et al. Predicting Fusarium head blight epidemics with weather-driven pre- and post-anthesis logistic regression model. Phytopathology 103, 906–919 (2013).
20. Krause, R. A., Massie, L. B. & Hyre, R. A. Blitecast: A computerized forecast of potato late blight. Plant Dis. Rep. 59, 95–98 (1975).
21. Willbur, J. F. et al. Weather-based models for assessing the risk of Sclerotinia sclerotiorum apothecial presence in soybean (Glycine max) fields. Plant Dis. 102, 73–84 (2018).
22. Willbur, J. F. et al. Validating Sclerotinia sclerotiorum apothecial models to predict Sclerotinia stem rot in soybean (Glycine max) fields. Plant Dis. 102, 2592–2601 (2018).
23. Smith, T. J. A risk assessment model for fire blight of apple and pear. Acta Hortic. 411, 97–104 (1996).
24. Steiner, P. W. Predicting apple blossom infections by Erwinia amylovora using the MARYBLYT model. Acta Hortic. 273, 139–148 (1990).
25. Krause, R. A. & Massie, L. B. Predictive systems: Modern approaches to disease control. Annu. Rev. Phytopathol. 13, 31–47 (1975).
26. Kaundal, R., Kapoor, A. S. & Raghava, G. P. S. Machine learning techniques in disease forecasting: A case study on rice blast prediction. BMC Bioinform. 7, 485 (2006).
27. Shahoveisi, F., Riahi Manesh, M. & del Río Mendoza, L. E. Modeling risk of Sclerotinia sclerotiorum-induced disease development on canola and dry bean using machine learning algorithms. Sci. Rep. 12, 864 (2022).
28. Svetnik, V. et al. Random forest: A classification and regression tool for compound classification and QSAR modeling. J. Chem. Inf. Comput. Sci. 43, 1947–1958 (2003).
29. Hill, T., Marquez, L., O’Connor, M. & Remus, W. Artificial neural network models for forecasting and decision making. Int. J. Forecast. 10, 5–15 (1994).
30. Barbedo, J. G. Deep learning applied to plant pathology: The problem of data representativeness. Trop. Plant Pathol. 47, 85–94 (2021).
31. Malhi, G. S., Kaur, M. & Kaushik, P. Impact of climate change on agriculture and its mitigation strategies: A review. Sustainability 13, 1318–1339 (2021).
32. Ramesh, S. et al. Plant disease detection using machine learning. In Proceedings of the International Conference on Design Innovations for 3C’s Compute Communicate Control (2018).
33. Ahmed, K., Shahidi, T. R., Irfanul Alam, S. Md. & Momen, S. Rice leaf disease detection using machine learning techniques. In 2019 International Conference on Sustainable Technologies for Industry 4.0 (STI) 1–5 (IEEE, 2019). https://doi.org/10.1109/STI47673.2019.9068096.
34. Tiwari, D. et al. Potato leaf disease detection using deep learning. In Proceedings of the International Conference on Intelligent Computing and Control Systems (2020).
35. Bock, C. H., Hotchkiss, M. W. & Wood, B. W. Assessing disease severity: Accuracy and reliability of rater estimates in relation to number of diagrams in a standard area diagram set. Plant Pathol. 65, 261–272 (2016).
36. Telenko, D. E. et al. Tar spot of corn. Crop Prot. Netw. https://cropprotectionnetwork.org/web-books/tar-spot-of-corn?section=tar-spot-of-corn-preface-and-introduction (2021).
37. Payne, A. F. & Smith, D. L. Development and evaluation of two pecan scab prediction models. Plant Dis. 96, 1358–1364 (2012).
38. Sentelhas, P. C., Monteiro, J. E. B. A. & Gillespie, T. J. Electronic leaf wetness duration sensor: Why it should be painted. Int. J. Biometeorol. 48, 202–205 (2004).
39. Zeileis, A. & Grothendieck, G. Zoo: S3 infrastructure for regular and irregular time series. J. Stat. Softw. 14, 1–27 (2005).
40. R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org (2021).
41. Kriss, A. B., Paul, P. A. & Madden, L. V. Relationship between yearly fluctuations in Fusarium head blight intensity and environmental variables: A window-pane analysis. Phytopathology 100, 784–797 (2010).
42. Harrell Jr., F. Hmisc: Harrell miscellaneous. R package version 4.7-2.
43. Signorell, A. et al. DescTools: Tools for descriptive statistics. R package version 0.99.47. https://cran.r-project.org/package=DescTools (2022).
44. Vanegas, L., Rondón, L. & Paula, G. glmtoolbox: Set of tools to data analysis using generalized linear models. R package version 0.1.4, https://CRAN.R-project.org/package=glmtoolbox (2022).
45. Liaw, A. & Wiener, M. Classification and Regression by randomForest. R News. 2, 18–22 (2002).
46. Fritsch, S., Guenther, F. & Wright, M. Neuralnet: Training of neural network. v.1.44.2. (2019).
47. Kuhn, M. caret: Classification and regression training. v.6.0-93. (2022).
48. Wickham, H. et al. Ggplot2: Elegant graphics for data visualization using the grammar of graphics. R package version 3.4.2, https://cran.r-project.org/web/packages/ggplot2/index.html (2023).
49. Sievert, C. et al. Plotly: Create interactive web graphics. R package version 4.10.2, https://cran.r-project.org/web/packages/plotly/index.html (2023).

Acknowledgements

This work was partially supported by the National Predictive Modeling Tool Initiative operating under the auspices of the USDA-ARS; the Wisconsin Corn Promotion Board; the Corn Marketing Program of Michigan; Project GREEEN - Michigan’s plant agriculture initiative; USDA National Institute of Food and Agriculture, Hatch Project #IND00162952; Foundation for Food Agricultural Research - Rapid Outcomes from Agricultural Research (FFAR-ROAR award # 0000000017) grant with matching funds provided by Pioneer, the National Corn Growers Association; The Illinois Corn Growers Association; Indiana Corn Marketing Council; Purdue University; and Hatch Project IOWN03908. The authors thank J. Ravellette and Steven Brand at Purdue University; John Boyse, Adam Byrne, and William Widdicombe at Michigan State University; John Shriver and Cody Schneider at Iowa State University; Nolan Anderson, Jesse Gray, and Sean Wood at the University of Kentucky; and Dan Sjarpe at the University of Missouri for assistance with field trial maintenance.

Author information

These authors contributed equally: Richard W. Webster and Camila Nicolli.

Authors and Affiliations

Department of Plant Pathology, North Dakota State University, Fargo, ND, 58108, USA
Richard W. Webster
Department of Plant Pathology, University of Wisconsin-Madison, Madison, WI, 53706, USA
Camila Nicolli, Brian D. Mueller, Teryl Schmidt & Damon L. Smith
Delta Research and Extension Center, Mississippi State University, Stoneville, MS, 38776, USA
Tom W. Allen
Division of Plant Science and Technology, University of Missouri, Columbia, MO, 65211, USA
Mandy D. Bish & Kaitlyn Bissonnette
Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, MI, 48824, USA
Jill C. Check & Martin I. Chilvers
Department of Plant Pathology, The Ohio State University, Wooster, OH, 44691, USA
Maíra R. Duffeck, Jane Marian Luis & Pierce A. Paul
Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
Nathan Kleczewski
Macon Ridge Research Station, LSU AgCenter, Winnsboro, LA, 71295, USA
Paul P. Price
Department of Plant Pathology, Entomology, and Microbiology, Iowa State University, Ames, IA, 50011, USA
Alison E. Robertson & Clarice Schmidt
Department of Botany and Plant Pathology, Purdue University, West Lafayette, IN, 47907, USA
Tiffanna J. Ross, Sujoung Shim & Darcy E. P. Telenko
Nutrient and Pest Management Program, University of Wisconsin-Madison, Madison, WI, 53706, USA
Roger Schmidt
Department of Plant Pathology, University of Kentucky, Princeton, KY, 42445, USA
Kiersten Wise

Contributions

R.W.W. and C.N. analyzed the data and co-wrote the manuscript draft. D.L.S. led and coordinated the project and is the corresponding author. All authors collected data and reviewed and edited the iterative manuscript drafts.

Corresponding author

Correspondence to Damon L. Smith.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Table 1.

Supplementary Table 2.

Supplementary Table 3.

Supplementary Legends.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Webster, R.W., Nicolli, C., Allen, T.W. et al. Uncovering the environmental conditions required for Phyllachora maydis infection and tar spot development on corn in the United States for use as predictive models for future epidemics. Sci Rep 13, 17064 (2023). https://doi.org/10.1038/s41598-023-44338-6

Received: 18 March 2023
Accepted: 06 October 2023
Published: 10 October 2023
DOI: https://doi.org/10.1038/s41598-023-44338-6
