Introduction

Tar spot, caused by Phyllachora maydis, is an emerging disease of corn (Zea mays L.) that can lead to significant yield losses in the United States1,2. First recorded infecting corn in Mexico as early as 19043, P. maydis has since been reported throughout much of Latin America4. Phyllachora maydis had never been documented in the United States until 2015, when tar spot was observed in multiple fields in northern Indiana and Illinois5. Since its arrival, P. maydis has spread rapidly throughout the midwestern corn belt of the United States (U.S.). It has also been found in Florida and in Ontario, Canada6, along with confirmations in Georgia and Virginia7,8. Under ideal environmental conditions, tar spot can cause severe epidemics. In 2018 alone, tar spot caused estimated yield losses of close to 5 million metric tons, equating to over 680 million USD of economic losses2.

Despite P. maydis being recognized as a pathogen of corn for over 100 years, there is still little understanding of its biology and epidemiology. Phyllachora maydis overwinters in the U.S. corn production regions, indicating the pathogen can survive on corn residue, and consequently serves as the inoculum source for at least the next season’s crop9,10. Monthly temperatures of 17–22 °C, relative humidity greater than 75%, leaf wetness of 7 h per night and 10–20 foggy days per month were reported as the optimal conditions for tar spot development11. Under controlled environments and optimal conditions, inoculation assays demonstrated a latent period of only 15 days, and sporulation occurring approximately 20 days post-inoculation12.

As P. maydis continues to establish itself across several states in the U.S., an integrated management approach to mitigate the yield losses is needed. Partial genetic resistance for tar spot has been identified in corn germplasm13,14, but many current commercial corn hybrids are considered highly susceptible. Fungicides are currently the most effective method for reducing tar spot development and yield losses, especially when two or three fungicide classes are used15.

Predictive modeling has been an effective tool for guiding the optimal timing of fungicide applications. Predictive models have been developed for a number of varying pathosystems including Fusarium head blight of wheat primarily caused by Fusarium graminearum in the U.S.16,17,18,19, late blight of potato caused by Phytophthora infestans20, Sclerotinia stem rot of soybean caused by Sclerotinia sclerotiorum21,22, and fire blight of apple and pear caused by Erwinia amylovora23,24. Many of these models have been integrated into decision support systems, allowing farmers access to the predictive abilities of these models. One successful example is Sporecaster (https://ipcm.wisc.edu/apps/sporecaster/), a decision support system for the prediction of Sclerotinia stem rot of soybean which is publicly available to farmers on smartphones22.

Historically, many predictive models have been developed using either linear or logistic regression models19,20,21,22,24,25, but more recently predictive model development has shifted towards machine learning based analyses26,27. One commonly used machine learning algorithm is a random forest (RF), which is an ensemble learning method for classification and regression28. The RF framework aggregates many decision trees, improving precision by reducing variance relative to single decision trees. However, RFs are not easily interpreted and overfitting can often occur. Another common machine learning algorithm is an artificial neural network29 (ANN), which resembles the interconnectedness and signaling of biological neurons. ANNs are made up of an input layer, one or more hidden layers, and an output layer. Within this network there are multiple nodes which are connected to many additional nodes, each of which carries its own associated weight and threshold. If the output from a single node meets a designated threshold, the node is triggered and sends data to the next layer of nodes. If a node does not meet the designated threshold, the node does not send data to the next layer. The downsides to using ANNs are their high level of complexity, which makes the models difficult to interpret, the considerable amount of computational power required to run them, and the potential for overfitting. However, machine learning algorithms have been demonstrated to be highly effective at improving predictive capabilities due to their ability to model very complex and non-linear relationships27,30.

The surge of new and/or re-emerging plant diseases represents one of the biggest challenges to food production in modern agriculture. As global climate change leads to instability of temperatures and changing precipitation patterns, the need to create greater resilience in our crop production systems has become crucial31. There are several knowledge gaps regarding tar spot development on corn. The tar spot cycle is not fully understood. Specifically, the incubation and latent periods have not been clearly established for P. maydis in production settings, and information on pathogen dispersal is limited. Knowledge of these processes is critical in understanding the polycyclic nature of tar spot epidemics. Therefore, the goals of the current study were to discern the environmental variables that are most important for the development of tar spot, and to develop statistical models for the prediction of future tar spot epidemics in the U.S. that would maximize the precision of in-season management decisions.

Results

Development of training and testing datasets

From this study, a dataset was compiled with 588 observations across the Midwest region of the U.S. including a binary response variable for the increase in P. maydis stroma between two consecutive rating dates. Of these 588 observations, 179 observations were taken from small-plot research trials between 2018 and 2022, and the additional 409 observations were taken from production fields between 2020 and 2022 (Fig. 1). From the combined 588 observations, designated training and testing data sets were created using a 70:30 split by randomly sampling from the full data set (small-plot and commercial fields combined) with replacement in which 70% of the observations were placed in the training data set and 30% of the observations were placed in the testing data set. The training data set included 96 observations where P. maydis developed or increased in severity and 310 observations in which P. maydis did not develop or increase in severity from the previous date. The testing data set included 36 observations in which P. maydis did increase in severity and 146 observations in which P. maydis did not develop or increase in severity. After the development of these two datasets, the training dataset was used for assessment of weather parameters and model development, while the testing data set was used for validation of the developed models from the training data set.

Figure 1

Map of all field locations where data were recorded and included in this study. The figure was created using the R statistical software40 (v. 4.3.1) and the ggplot2 package48.

Assessment of weather parameters

Multiple weather variables from the IBM historical weather data service were examined in this study across three levels of moving averages (windowpanes): 30-day (Fig. 2A), 21-day (Fig. 2B), and 14-day (Fig. 2C). By evaluating Pearson correlation coefficients of these moving averages in relation to the delta response variable, the strongest correlations were detected for the 30-day moving averages of the daily minimum ambient temperature and the daily mean ambient temperature with coefficients of −0.39 and −0.38, respectively (Fig. 2A, Suppl. Table 1). Within the 21-day moving averages, the two variables with the strongest correlations to P. maydis development or increase were the daily minimum dew point and the daily minimum temperature with coefficients of −0.36 and −0.35, respectively (Fig. 2B, Suppl. Table 1). Overall, there were eight 30-day moving average variables, fifteen 21-day moving average variables, and sixteen 14-day moving average variables significantly correlated with P. maydis development or increase in severity (Fig. 2, Suppl. Table 1).

Figure 2

Pearson correlation matrix of the binary delta variable and 30-day moving averages, 21-day moving averages, and 14-day moving averages. Heatmaps were created using the R statistical software40 (v. 4.3.1) and the plotly package49. Hyperlink can be used to view the interactive figure. https://chart-studio.plotly.com/~richard.webster/1/#plot.

Hock et al.11 previously proposed that relative humidity (RH) levels are extremely important in explaining P. maydis presence or increase in severity, especially at mean RH levels of 75% or greater. To investigate the impact of RH on P. maydis, we evaluated multiple 30-day moving averages of daily total hours of RH levels that ranged from 60 to 95% at 5% increments. Daily total hours of RH greater than 90% was significantly negatively correlated with P. maydis development or severity increase for all three levels of moving averages (Fig. 2, Suppl. Table 1). Furthermore, daily total hours of RH greater than 85% was also significantly negatively correlated with P. maydis development or severity increase for the 21-day and 14-day moving averages. Since these results suggested the importance of the 90% RH threshold, we also investigated the correlation of nighttime total hours of RH greater than 90% between 8 pm and 6 am. Nighttime total hours of RH greater than 90% was more strongly negatively correlated in all three levels of moving averages than the daily totals assessed originally (Suppl. Table 1). Overall, the majority of the discussed RH variables were negatively correlated with P. maydis development or severity increase (Fig. 2, Suppl. Table 1). Additionally, a daily total wetness hour parameter was assessed as a proxy for the presence of leaf wetness. Similar to RH greater than 90% at night, the wetness hour parameter was evaluated as two distinct parameters: the total daily hours with predicted wetness and the total nighttime hours with predicted wetness. Neither wetness hour parameter was significantly correlated with P. maydis development or severity increase in the 30-day moving averages, but both were significant for the 21-day and 14-day moving averages. The total nighttime wetness hours parameter was most highly correlated at the 14-day moving average with a correlation coefficient of −0.17 (P = 0.001, Fig. 2, Suppl. Table 1).

All assessed weather variables were used to create single variable logistic regression (LR) models for explaining P. maydis development and severity increase. These models were then evaluated by comparing Akaike information criterion (AIC) values, C-statistic values, and Hosmer–Lemeshow goodness-of-fit test P-values. From these evaluations, the two best fitting models were those using 30-day moving averages of either the daily minimum temperature or the daily mean temperature (Fig. 3, Suppl. Table 2). When the predicted risk probability was examined for these two parameters, the inflection points observed were 15.4 °C for the daily minimum temperature and 20.5 °C for the daily mean temperature (Fig. 3).
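For a single variable logistic model, the inflection point of the risk curve coincides with a risk probability of 50%, which gives one way to relate a reported inflection point to fitted coefficients. The symbols below (intercept \beta_{0}, slope \beta_{1}, moving-average weather variable x) are generic notation assumed here for illustration; the fitted single variable coefficients are not given in the text.

$$\begin{aligned} Logit & = \beta_{0} + \beta_{1} x,\qquad p = \frac{1}{1 + e^{ - Logit}} \\ p & = 0.5\; \Leftrightarrow \;Logit = 0\; \Leftrightarrow \;x^{*} = - \frac{\beta_{0}}{\beta_{1}} \\ \end{aligned}$$

With a negative slope, as the negative temperature correlations would suggest, values of x above x^{*} correspond to risk probabilities below 50%.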

Figure 3

Logistic regression models developed for predicting the development of tar spot caused by Phyllachora maydis. These models predict the risk probability (%) of tar spot developing in relationship with (A) 30-day moving average of daily minimum ambient temperature (°C), (B) 30-day moving average of daily mean ambient temperature (°C), or (C) 21-day moving average of daily minimum dew point (°C).

Many of the moisture related parameters were best fitting when using either the 14-day or 21-day moving averages, compared to the 30-day moving average. For example, the daily minimum DP was observed to be best fitting when using the 21-day moving averages compared to the 14- and 30-day moving averages (Suppl. Table 2). From the predicted risk probabilities based on the 21-day moving average of the daily minimum DP, an inflection point of 13.1 °C was observed (Fig. 3). In addition, the nighttime total hours of RH greater than 90% and the total nighttime hours with predicted wetness parameters were best fitting when using the 14-day moving averages. This suggests the importance of lower moisture in the 14 to 21 days prior to P. maydis development or increase in severity (Suppl. Table 2).

Development of predictive models

With many single variable models developed, multi-variable models were then developed using the results of the previous assessments. Since the 30-day moving average of the daily minimum temperature and the daily mean temperature were the two most influential variables (Fig. 2, Suppl. Tables 1 and 2), these two variables were examined more closely. Daily mean temperature was consistently more influential than the daily minimum temperature, and thus this variable was included in all subsequent models, which included many moisture variables. Eight models were chosen based on their input variables and favorable statistics reported above. Four of these models used the 30-day moving averages and included the combination of the daily mean temperature with either the daily total hours of RH greater than 90%, daily total wetness hours, daily minimum dew point depression (DPD), or the daily maximum RH (Suppl. Table 2). After these models were developed, the combination of different moving averages of weather parameters were explored due to the difference in influence as presented by the correlation coefficients and the single variable LRs (Suppl. Table 1 and 2). Four models were identified which all included the 30-day moving average of the daily mean temperature in addition to the 21- and 14-day moving averages of either the daily total hours with RH greater than 90%, the daily total nighttime hours with RH greater than 90%, the daily total wetness hours, or the daily total of nighttime wetness hours.

The corresponding eight LR models (LR1–LR8) were selected to be validated using the previously established testing dataset. The linearized logistic models for these eight LRs are defined as:

$$\begin{aligned} Logit_{LR1} & = 21.92522 - 0.97199\left( {30-day\;moving\;average\;mean\;temperature} \right) \\ & \quad - \;0.25014\left( {30-day\;moving\;average\;of\;daily\;total\;of\;hours\;with\;RH > 90} \right) \\ \end{aligned}$$
(1)
$$\begin{aligned} Logit_{LR2} & = 22.6108 - 0.9880\left( {30-day\;moving\;average\;mean\;temperature} \right) \\ & \quad - \;6.0357\left( {30-day\;moving\;average\;of\;daily\;mean\;wetness\;hours} \right) \\ \end{aligned}$$
(2)
$$\begin{aligned} Logit_{LR3} & = 17.7869 - 0.8964\left( {30-day\;moving\;average\;mean\;temperature} \right) \\ & \quad + \;0.8157\left( {30-day\;moving\;average\;of\;daily\;minimum\;dew\;point\;depression} \right) \\ \end{aligned}$$
(3)
$$\begin{aligned} Logit_{LR4} & = 32.06987 - 0.89471\left( {30-day\;moving\;average\;mean\;temperature} \right) \\ & \quad - \;0.14373\left( {30-day\;moving\;average\;of\;daily\;maximum\;relative\;humidity} \right) \\ \end{aligned}$$
(4)
$$\begin{aligned} Logit_{LR5} & = 21.21170 - 0.94178\left( {30-day\;moving\;average\;mean\;temperature} \right) \\ & \quad - \;0.23661(21-day\;moving\;average\;of\;daily\;total\;of\;hours\;with\;RH > 90) \\ \end{aligned}$$
(5)
$$\begin{aligned} Logit_{LR6} & = 20.35950 - 0.91093\left( {30-day\;moving\;average\;mean\;temperature} \right) \\ & \quad - \;0.29240\left( {14-day\;moving\;average\;of\;daily\;total\;nighttime\;hours\;with\;RH > 90} \right) \\ \end{aligned}$$
(6)
$$\begin{aligned} Logit_{LR7} & = 22.18844 - 0.96662\left( {30-day\;moving\;average\;mean\;temperature} \right) \\ & \quad - \;0.25134\left( {21-day\;moving\;average\;of\;daily\;total\;of\;wetness\;hours} \right) \\ \end{aligned}$$
(7)
$$\begin{aligned} Logit_{LR8} & = 21.66220 - 0.94504\left( {30-day\;moving\;average\;mean\;temperature} \right) \\ & \quad - \;0.34001\left( {14-day\;moving\;average\;of\;daily\;total\;nighttime\;wetness\;hours} \right). \\ \end{aligned}$$
(8)
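Each linearized model above yields a risk probability through the inverse logit, p = 1/(1 + exp(−Logit)). The following Python sketch illustrates this for Eqs. (4) and (6). The coefficients are copied from those equations; the function names are hypothetical, and the input units (°C, %, hours per day) are assumed from the variable descriptions. The study itself was carried out in R, so this is an illustration rather than the authors' implementation.

```python
import math


def inverse_logit(logit):
    """Convert a logit (log-odds) into a probability in (0, 1).

    Written in a numerically stable form so large |logit| does not overflow.
    """
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


def risk_lr4(mean_temp_30d, max_rh_30d):
    """Risk probability from Eq. (4).

    mean_temp_30d: 30-day moving average of daily mean temperature (assumed degrees C)
    max_rh_30d:    30-day moving average of daily maximum RH (assumed percent)
    """
    logit = 32.06987 - 0.89471 * mean_temp_30d - 0.14373 * max_rh_30d
    return inverse_logit(logit)


def risk_lr6(mean_temp_30d, night_rh90_hours_14d):
    """Risk probability from Eq. (6).

    night_rh90_hours_14d: 14-day moving average of daily total nighttime
                          hours with RH > 90% (assumed hours per day)
    """
    logit = 20.35950 - 0.91093 * mean_temp_30d - 0.29240 * night_rh90_hours_14d
    return inverse_logit(logit)
```

Because both moisture coefficients and the temperature coefficient are negative, the sketch returns lower risk for warmer or more humid inputs, in line with the negative correlations reported above.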

The eight models were then assessed based on multiple model quality characteristics including accuracy (%), kappa value, type I error (%), type II error (%), precision (%), and recall (%) when using a 35% risk probability threshold for the prediction of P. maydis development or severity increase. From these evaluations, LR6 had the greatest accuracy (86.8%), greatest kappa value (0.59), greatest recall (69.4%), and the lowest type II error rate (30.6%, Table 1). Furthermore, LR4 had some of the best values for accuracy (86.3%), kappa (0.56), type I error rate (8.2%), and precision (65.7%, Table 1). From these results, a multi-model ensemble was created using LR4 and LR6 (Fig. 4) to more robustly predict the development of P. maydis or increase in its severity. The corresponding multi-model ensemble improved the accuracy (87.4%), kappa value (0.61), and precision (67.6%) while maintaining low type I error rate (8.2%), low type II error rate (30.6%), and high recall (69.4%, Table 1).
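The evaluation metrics named above can all be derived from a 2 × 2 confusion matrix. The sketch below is a minimal Python illustration (the study used the confusionMatrix() function in R). It assumes that type I error is the false positive rate and type II error is the false negative rate, which is consistent with the reported recall and type II error summing to 100%. The example counts are not reported in the text; they are back-calculated here from the testing set composition (36 increase and 146 no-increase observations) and the percentages reported for the ensemble, so they should be read as an inference.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute evaluation metrics from confusion-matrix counts.

    tp: predicted increase, observed increase
    fp: predicted increase, observed no increase
    fn: predicted no increase, observed increase
    tn: predicted no increase, observed no increase
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, used by Cohen's kappa.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (accuracy - expected) / (1.0 - expected)
    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "type_i_error": fp / (fp + tn),   # assumed: false positive rate
        "type_ii_error": fn / (fn + tp),  # assumed: false negative rate
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Back-calculated (not reported) counts for the multi-model ensemble.
ensemble_metrics = classification_metrics(tp=25, fp=12, fn=11, tn=134)
```

With these inferred counts, the function reproduces the ensemble values given above when rounded (accuracy 87.4%, kappa 0.61, precision 67.6%, recall 69.4%).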

Table 1 Model evaluation metrics for eight logistic regression models (LR1-LR8), a multi-model ensemble, a random forest model (RF), and an artificial neural network model (ANN) for predicting the development of tar spot on corn between 2018 and 2022 (n = 182).

Figure 4

Two dimensional surfaces of logistic regression models developed for predicting the development of tar spot caused by Phyllachora maydis. Each panel shows the risk probability (%) of tar spot as a function of the 30-day moving average of daily mean ambient temperature (°C) and a second variable: (A) logistic regression 4, with the 30-day moving average of daily maximum relative humidity; (B) logistic regression 6, with the 14-day moving average of total nighttime hours with relative humidity > 90%.

Furthermore, two machine learning algorithms were developed for the prediction of P. maydis development and increase in severity: an RF model using 500 trees and an ANN model using nine hidden layers. These two machine learning algorithms were assessed similarly to the previous LRs and the multi-model ensemble. The RF outperformed all other models for every metric except recall, for which it had the lowest observed value (58.3%, Table 1). The corresponding RF had an observed accuracy of 90.1%. The ANN resulted in a model accuracy of 85.7% and a relatively high type II error rate (38.9%), and all other metrics were unremarkable (Table 1).

Discussion

Through this study, a deeper epidemiological understanding of P. maydis has been uncovered. The corresponding research suggests the development and increase in P. maydis stroma under field conditions is primarily driven by extended periods of moderate mean ambient temperature (18–23 °C) but tar spot is discouraged by extended periods of high relative humidity (> 90%). Additionally, the development of multiple statistical models offers a tool in production systems to guide fungicide applications to help farmers maximize their return on investment.

From our findings, the weather parameters with the strongest correlations to P. maydis development and severity increase included the 30-day moving averages of either daily minimum temperature or the daily mean temperature (Suppl. Table 1). These two weather parameters were also correlated with P. maydis development and severity increase from the 21-day and 14-day moving averages but were not as highly correlated as the 30-day moving averages (Suppl. Table 1). Specifically, moderately warm air temperatures appear to drive P. maydis development or severity increase, while excessively warm conditions (mean temperatures greater than 23 °C) considerably decreased the probability of P. maydis progression (Fig. 3B). This demonstrates there is a long-term influence of moderate ambient temperature that drives epidemiological processes within the tar spot cycle. These results confirm previous reports by Hock et al.11 that moderate temperatures (17–22 °C) were one of the primary determinants of tar spot progress and severity. Hock et al.11 also reported that during warm seasons with mean ambient temperatures greater than 22 °C, tar spot development was minimal. Moderate mean temperatures could be influencing multiple epidemiological processes within the disease cycle such as germination of initial inoculum, infection of the host, mycelial colonization within host tissue, or the development of the ascomata. Temperature has been well characterized to play an important role in all of these processes in many other fungal organisms21. Additional investigations on the effects of temperature on the development of P. maydis on corn are still needed to further elucidate this relationship.

In addition to temperature, moisture weather variables were consistently observed to influence tar spot development, although to a lesser degree than temperature. Specifically, the 21-day moving average of the daily minimum dew point (DP) had the third greatest overall correlation with P. maydis development or severity increase, but this was a negative correlation (Fig. 2, Suppl. Table 1). Many additional moisture parameters were significantly correlated with P. maydis development or severity increase across all three levels of moving averages, such as the RH90 and the nighttime RH90 variables, as has been previously reported (Fig. 2). Interestingly, these moisture variables were negatively correlated with increasing tar spot severity. Breunig et al.12 point out that in controlled-environment inoculations frequent misting was only required in the first 5 days after inoculation. After 5 days, misting had to be withheld to produce stroma in these controlled environments. Perhaps leaf wetness is required by the fungus for spore germination and leaf penetration, while excessive moisture later in the infection process can lead to conditions unfavorable for the progression of infection. Regardless, these moisture variables are clearly playing a role in the biological processes driving the development of P. maydis. Variables such as DP and RH are still dependent on temperature. Thus, the relationships presented here demonstrate the complexity that exists between the roles of ambient temperature and moisture on P. maydis development in the physical environment. As Hock et al.11 reported, RH levels greater than 75% were important for tar spot development. We examined the effect of different RH thresholds ranging from 60 up to 95%, and we consistently determined the 90% RH threshold was the most influential of the eight examined RH thresholds for explaining the development and increase in severity of P. maydis, with RH90 being significantly negatively correlated with P. maydis stroma development in all three levels of moving averages (Suppl. Table 1) and resulting in the best fitting models across all RH thresholds (Suppl. Table 2). Our data agree with those of Hock et al.11 in that RH was very important in predicting P. maydis development; however, the work presented here suggests that extended periods of high RH are antagonistic to the development of tar spot. These findings seem consistent with the observation of Breunig et al.12 that only intermittent periods of wetness are needed to support the development of tar spot. Thus, the results presented here refine our understanding of the role RH plays in the epidemiology of tar spot in the U.S.

Another important objective of this study was to compare LR models to more modern machine learning algorithms. From our study, an RF machine learning algorithm resulted in one of the best models and had the greatest observed model accuracy of 90.1%. The ANN examined in this study did not result in model accuracy as high as several of the LR models developed here. Two LRs, LR4 and LR6, were highly accurate. Accuracy was further improved by ensembling these two LR models, with an accuracy estimate of 87.4% while either improving or maintaining all of the additional model assessment characteristics (Table 1). Our analyses demonstrate that the RF was slightly more accurate in predicting P. maydis development compared to LRs, but a multi-model ensemble using two of the LRs was still comparable in predicting P. maydis development while balancing all model evaluation metrics. These results confirm previous studies on predicting plant diseases with a high degree of accuracy using different machine learning algorithms27,32,33. However, in some studies logistic regressions were reported to still be the most accurate at predicting plant diseases34.

Functionally, LR models may be more useful in actual delivery to farmers and can be easily programmed into smartphone application decision support systems (DSS) as has been previously demonstrated22. The models presented here have high levels of potential for improving the application timings of fungicides for managing tar spot. Tar spot may result in severe yield losses; thus, farmers often rely on multiple fungicide applications during the season, which can equate to high economic and environmental costs. The use of these DSS will guide farmers in optimizing fungicide application timing to protect the plant from the pathogen when it is most likely to cause disease. Furthermore, using these DSS can also eliminate unnecessary applications which benefit the farmers by limiting needless economic inputs and decreasing chemical inputs into the environment.

While these identified models are highly accurate at predicting P. maydis development, there is inherent error associated with any model. This error could be explained by variability in environmental conditions which could not be accounted for, the quantity of initial inoculum, the population structure of the pathogen within the field, or resistance levels among site-years. Additionally, there may have been interrater variability associated with disease ratings, especially since these data were collected from multi-state projects with numerous raters across multiple years. However, we minimized this error with the use of standard area diagrams (CPN) that help improve disease severity estimates35.

The current study sheds light on epidemiological processes that are driving the development of a newly emerged pathogen of corn capable of causing severe disruptions to agricultural production. Overall, extended periods (30 days) of cool temperatures appear to be most important for tar spot development, with an apparent interaction with shorter periods (14–21 days) of low moisture conditions. The work presented here has also paved the way for the development of a DSS for tar spot. Work is underway to incorporate these models into the Tarspotter DSS (https://ipcm.wisc.edu/apps/tarspotter/) to further improve tar spot prediction and better inform farmers of risk due to plant disease.

Material and methods

Field trials

Small plot field trials were planted between 2018 and 2022 in the following states: Illinois, Iowa, Indiana, Kentucky, Michigan, Missouri, Ohio, and Wisconsin (Fig. 1). Locally adapted hybrids were used at each location. The use of all plant material in this study did not require any specific permissions or licenses. Trials at each location followed locally recommended management practices such as seeding rates, nitrogen fertilization, and herbicides with a small number of trials overhead irrigated. Field trials in 2018 and 2019 included the use of fungicide applications, but only the non-treated plots were considered for this study. All small plot research trials were designed as randomized complete block designs. No fungicide applications were made in trials conducted between 2020 and 2022. Commercial field sites were also assessed across the Midwest U.S. between 2021 and 2022, and these included fields under regional grower conditions. These commercial fields were not designed for research and served as locations of observing disease development. The conducted field trials were performed with permission from local commercial grower collaborators and were compliant with all institutional, national, and international guidelines and legislation. Additional field information is provided in Supplementary Table 3.

Data collection

Phyllachora maydis ratings started at the R1 growth stage (silking) and continued until the R5-R6 growth stage (dent to full maturity). The number of P. maydis severity ratings during this period ranged from two to seven depending on the site-year. In the small plot trials, P. maydis severity was rated by visually assessing the percentage of P. maydis-induced stroma on the ear leaf of five to ten plants per plot (sub-samples) using a standardized rating scale36, and all ratings were averaged across the entire plot. For each rating date in each site-year, all plot severity scores were averaged for a single severity score for that plot. In the commercial fields, five corn plants were randomly selected across the field and were evaluated using the same protocol as described for small plot trials. For each rating date, disease ratings of the five plants were averaged to calculate a single severity score. The compiled database considered for developing the prediction modeling was the average P. maydis severity of the ear leaf for each assessment day. The severity data were aggregated into a single file. For each location, all ratings were aligned in sequential order by date. A binary delta variable was defined as the increase in severity of P. maydis stroma between two sequential rating dates, such that a delta value of 1 was given for any positive increase in P. maydis severity between two sequential dates. If no increase in P. maydis severity was observed, a delta value of 0 was given. Thus, delta values of 1 define P. maydis increase while delta values of 0 define no increase.
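The binary delta variable described above can be sketched in a few lines of Python. The function name and the example severity values are hypothetical and serve only to illustrate the definition; the input is assumed to be the mean ear leaf severity (%) for one location, already sorted by rating date.

```python
def binary_delta(severities):
    """Return the binary delta for each pair of sequential rating dates.

    severities: list of mean severity values for one location, in date order.
    A value of 1 marks any positive increase between two sequential ratings;
    a value of 0 marks no increase (unchanged or lower severity).
    """
    return [
        1 if later > earlier else 0
        for earlier, later in zip(severities, severities[1:])
    ]


# Hypothetical site with five rating dates, giving four delta values.
example_deltas = binary_delta([0.0, 0.0, 1.5, 1.5, 4.0])
```

A site-year with k rating dates therefore contributes k − 1 observations, each of which is later paired with the weather preceding the second date of the pair.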

Weather data collection

Site-specific weather data were collected using IBM historical weather services. Hourly average weather data were pulled from this service at a resolution of 4 km grids using GPS coordinates for each field location. The collected weather data included the hourly averages of ambient air temperature (AT, °C), relative humidity (RH, %), wind speed (WS, m/s), dew point (DP, °C), and precipitation (mm/hour). From these hourly weather data, dew point depression (DPD, °C) values were calculated for each hour by taking the absolute value of the difference between the AT and DP. A binary wetness hour variable (WH) was calculated by defining a ‘1’ if the DPD was less than or equal to 2 °C, predicting the presence of free water on leaf surfaces, and a ‘0’ if the DPD was greater than 2 °C37,38. Additionally, a binary nighttime wetness hour variable was calculated similarly to the previously described wetness hour variable but could only be considered true between the hours of 8 pm and 6 am. All other daytime hours were considered a ‘0’ value. A binary RH variable (RH95) was calculated by defining a ‘1’ if the RH was greater than or equal to 95%, and a ‘0’ if the RH level was less than 95%. Additional binary RH variables (RH90, RH85, RH80, RH75, RH70, RH65, and RH60) were calculated similarly with RH thresholds of 90%, 85%, 80%, 75%, 70%, 65%, or 60%, respectively. A binary nighttime RH90 variable was calculated by defining a ‘1’ if the RH was greater than 90% between the hours of 8 pm and 6 am; all other nighttime hours and all daytime hours were defined as a ‘0’.
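The derived hourly variables described above can be illustrated with a short Python sketch. All function names and the example records are hypothetical. Treating 8 pm as the first nighttime hour and 6 am as the first daytime hour is an assumption, since the handling of the boundary hours is not stated in the text.

```python
NIGHT_START_HOUR = 20  # 8 pm (assumed to be included in the night)
NIGHT_END_HOUR = 6     # 6 am (assumed to be excluded from the night)


def dew_point_depression(air_temp_c, dew_point_c):
    """DPD: absolute difference between air temperature and dew point."""
    return abs(air_temp_c - dew_point_c)


def wetness_hour(air_temp_c, dew_point_c, max_dpd_c=2.0):
    """Binary wetness hour: 1 if DPD <= 2 degrees C, otherwise 0."""
    return 1 if dew_point_depression(air_temp_c, dew_point_c) <= max_dpd_c else 0


def is_nighttime(hour_of_day):
    """True for hours 20-23 and 0-5 (local time, 24-hour clock)."""
    return hour_of_day >= NIGHT_START_HOUR or hour_of_day < NIGHT_END_HOUR


def nighttime_wetness_hour(hour_of_day, air_temp_c, dew_point_c):
    """Binary wetness hour that can only be 1 during nighttime hours."""
    if not is_nighttime(hour_of_day):
        return 0
    return wetness_hour(air_temp_c, dew_point_c)


def daily_totals(hourly_records):
    """Sum the binary variables over one day.

    hourly_records: list of (hour_of_day, air_temp_c, dew_point_c) tuples.
    Returns (total wetness hours, total nighttime wetness hours).
    """
    total_wh = sum(wetness_hour(at, dp) for _, at, dp in hourly_records)
    total_night_wh = sum(
        nighttime_wetness_hour(hour, at, dp) for hour, at, dp in hourly_records
    )
    return total_wh, total_night_wh
```

The binary RH variables follow the same pattern, with the DPD comparison replaced by a comparison of RH against the chosen threshold.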

From these hourly weather values, daily mean, minimum, and maximum values were calculated for each of the following variables: AT, RH, WS, DP, and DPD. Daily mean and daily maximum precipitation rates were calculated. Daily totals for WH, nighttime WH, RH95, RH90, RH85, RH80, RH75, RH70, RH65, RH60, and nighttime RH90 were also calculated for each location. After all daily means, minimums, maximums, and totals were calculated, 30-day, 21-day, and 14-day moving averages (windowpanes) were calculated for each of the weather variables using the rollmean() function from the ‘zoo’ package in R39,40. Windowpaning has been useful in modeling Fusarium head blight, for instance, allowing epidemiologists to find and define specific time frames for weather variables that are influential in plant disease development41. Finally, the previously established binary delta values were paired with the 30-day, 21-day, or 14-day moving averages of weather data for the second rating date.
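As an illustration of the windowpane calculation, the Python sketch below computes a moving average over a fixed number of days. It is not the rollmean() call used in the study. In particular, attaching each average to the last day of its window (a trailing window) is an assumption made here, because the alignment used by the authors is not stated. The function name is hypothetical.

```python
def moving_average(daily_values, window):
    """Trailing moving average over `window` consecutive daily values.

    Returns one value per complete window, so the output has
    len(daily_values) - window + 1 entries; entry i summarizes
    days i .. i + window - 1 and is assumed to belong to the last of them.
    """
    if window <= 0:
        raise ValueError("window must be a positive number of days")
    averages = []
    running_sum = 0.0
    for index, value in enumerate(daily_values):
        running_sum += value
        if index >= window:
            # Drop the day that has just left the window.
            running_sum -= daily_values[index - window]
        if index >= window - 1:
            averages.append(running_sum / window)
    return averages
```

For the models in this study, `window` would be 30, 21, or 14, and the average attached to a rating date would be paired with the binary delta value for that date.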

Correlation analysis and logistic regression model development

First, the total dataset was split to create training (n = 406) and testing (n = 182) datasets using bootstrapping with replacement. Correlation analyses were performed in R using the rcorr() function from the ‘Hmisc’ package42. These analyses calculated the Pearson correlation coefficients for the delta values with respect to either 30-day, 21-day, or 14-day moving averages (windowpanes). Significant correlations were determined by a P-value of less than 0.05 (Suppl. Table 1). All LRs were developed with the delta variable as the response variable, as a method to predict the increasing development of tar spot. Single variable LRs were created by using each of the 30-day, 21-day or 14-day moving averages for each of the weather parameters previously described. Additional multi-variable LRs were developed using a combination of these weather variables. All LRs were evaluated by Akaike information criterion (AIC) values and area under the receiver operating characteristics curve (C statistic) using the Cstat() function from the ‘DescTools’ package in R43, and tested by the Hosmer–Lemeshow goodness of fit test (HL test) using the hltest() function from the ‘glmtoolbox’ package in R44. Favorable models were determined as having the lowest AIC values, the highest C statistics, and an HL test P-value of greater than 0.05. From these assessments, eight LR models (LR1-LR8) were identified for further evaluation. Additionally, a multi-model ensemble was created by taking the daily average risk probability from the LR4 and LR6 models. An exhaustive approach was performed to examine all other multi-model ensembles, but the ensemble pursued was determined to be the best fitting model.
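The ensemble step can be sketched as a simple average of two risk probabilities followed by a threshold. The Python below is a minimal illustration with hypothetical function names. It takes the LR4 and LR6 probabilities as inputs rather than recomputing them, uses the 35% risk threshold reported in the Results, and assumes that a risk exactly equal to the threshold counts as a predicted increase.

```python
def ensemble_risk(risk_lr4, risk_lr6):
    """Average the daily risk probabilities (0-1) from LR4 and LR6."""
    return (risk_lr4 + risk_lr6) / 2.0


def predict_increase(risk, threshold=0.35):
    """Return 1 (increase predicted) if risk meets the threshold, else 0.

    Using >= rather than > at the boundary is an assumption.
    """
    return 1 if risk >= threshold else 0


# Hypothetical daily probabilities from the two models.
daily_prediction = predict_increase(ensemble_risk(0.25, 0.5))
```

Averaging probabilities rather than binary predictions keeps the ensemble output on the same 0 to 1 scale as the individual models, so the same threshold can be applied to all of them.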

Evaluation against machine learning algorithms

To evaluate whether the developed LR models were adequately predicting the progression of P. maydis on corn plants, the eight best-fitting LR models and the ensemble model were compared against two different machine learning algorithms: random forests (RF) and artificial neural networks (ANN). Using the training dataset, an RF model was fitted with the delta response variable explained by all predictor variables using the randomForest() function from the ‘randomForest’ package in R45, with a total of 500 trees and all other hyperparameters left at their defaults. The resulting RF model was then tested on the testing dataset to determine the accuracy of predicting the delta response variable. The training set was also used to create an ANN using the neuralnet() function from the ‘neuralnet’ package in R46, with nine hidden layers and all other hyperparameters set to their defaults. This ANN was then used to evaluate the ability to predict the delta response variable within the testing dataset. Predictions on the testing dataset from the eight LR models, the ensemble model, and the two machine learning models were evaluated for their accuracy (%), kappa values, type I error (%), type II error (%), precision (%), and recall (%). These metrics were evaluated for each model using the confusionMatrix() function from the ‘caret’ package in R47.