The Imperial College Storm Model (IRIS) Dataset

Sparks, Nathan; Toumi, Ralf

doi:10.1038/s41597-024-03250-y

Data Descriptor
Open access
Published: 24 April 2024

Scientific Data volume 11, Article number: 424 (2024)

Abstract

Assessing tropical cyclone risk on a global scale given the infrequency of landfalling tropical cyclones (TC) and the short period of reliable observations remains a challenge. Synthetic tropical cyclone datasets can help overcome these problems. Here we present a new global dataset created by IRIS, the ImpeRIal college Storm model. IRIS is novel because, unlike other synthetic TC models, it only simulates the decay from the point of lifetime maximum intensity. This minimises the bias in the dataset. It takes input from 42 years of observed tropical cyclones and creates a 10,000 year synthetic dataset of wind speed which is then validated against the observations. IRIS captures important statistical characteristics of the observed data. The return periods of the landfall maximum wind speed are realistic globally.

Background & Summary

Tropical cyclones (TCs) are extreme weather systems which, when they make landfall, can become a deadly natural disaster causing deaths and billions of dollars worth of loss per year¹. Under climate change, TCs, also known as hurricanes and typhoons, are expected to become more intense and therefore even more damaging^2,3. Reliable estimates of TC risk are desirable and can help mitigate TC hazard impact. However, TCs are relatively infrequent events, and those which make landfall are even more scarce. This makes assessing even the current climate baseline risk a challenge. The relatively short length of the reliable historical record compounds this difficulty.

A popular approach to this problem is creating synthetic sets of TCs which share important properties of observed TCs. These datasets may be arbitrarily long and therefore overcome the problem of scarcity in the observations. Generating large sets of synthetic TCs using full-physics numerical simulations is problematic for two main reasons. Firstly, running global climate simulations at resolutions which resolve TCs effectively is computationally very costly, so it is not feasible to generate the thousands of years required by the risk modelling community using this method. Secondly, there are still many biases present in TCs simulated in climate models.

Statistical methods are therefore often preferred and have been popular in recent decades. An early model for the North Atlantic⁴ re-sampled observed genesis locations and derived stochastic tracks from these based on models fitted per 5 degree box. Hybrid statistical-physical models couple statistical elements (e.g. track) to physics simulations⁵ (e.g. intensity). These models may require a large quantity of input data and be computationally expensive. Furthermore, landfall statistics may not be well-represented and are subjected to bias correction⁶. A physics-based statistical downscaling model forced by environmental fields reasonably reproduces observed landfall return curves in some global sub basins⁷. Fully statistical models have also been developed at the regional scale^8,9 which are quick and cheap to run. A global model has also been recently proposed¹⁰. All the above models simulate TCs beginning at the point of genesis, when the TC first comes into being as a weak storm. Errors or biases in the simulation of genesis, intensification, track, and decay of TCs are all likely to lead to biases in the simulated landfall statistics.

The main purpose of these types of models is to simulate TCs at landfall. A key insight¹¹ is that landfall maximum wind speed can be reliably estimated from the position and value of the life-time maximum intensity. The problem then becomes a decay only problem. This assumption short circuits much of the life-cycle which avoids the accumulation of errors occurring in other models. It also allows us to simulate other important variables such as the size and pressure-wind relationship specifically for this critical decay phase of the life-cycle, rather than assuming one relationship throughout when very different statistical relations may apply in different phases of the TC life-cycle. This is a novel approach not followed by other models^4,6,8,9,10. Here we present a new model, the ImpeRIal college Storm model (IRIS), to test and demonstrate the utility of this assumption.

Methods

Model overview

IRIS is a stochastic TC wind model capable of producing thousands of years’ worth of TCs at relatively low computational cost (the dataset described here was generated in a few hours on a 40-core workstation). A key novel component of IRIS is that only post-LMI TC tracks are simulated. This means that no biases or errors associated with the simulation of the intensification stage of TC development can affect the model. Meanwhile the most hazardous and critical, post-LMI, stage of the TC, including landfall, receives full focus. IRIS simulates basins and years independently. For a given basin and model year, IRIS randomly samples post-LMI tracks from the entire 42-year historical record and perturbs and extends these tracks as required. Each TC track is assigned a new initial maximum wind speed (LMI). From LMI a decay parameterisation governs the maximum wind speed until dissipation. Decay over land and ocean are different. A size is calculated by a parametric radial wind profile for each TC which evolves along its track. Finally, IRIS simulates the minimum pressure. Running IRIS proceeds in three main stages: (1) data preparation, (2) parameter fitting, and (3) simulation. These stages are described below.

Data preparation

IRIS requires input in the form of: historical TC track data, climatological surface pressure, climatological wind speeds at the TC steering level, and fields of climatological maximum potential intensity (MPI, hereafter PI). We describe how each of these is obtained.

TC track data

We extract TC track data from the International Best Track Archive for Climate Stewardship (IBTrACS, v04r00) World Meteorological Organisation data^12,13. Tracks were restricted to seasons in the range 1980 to 2021, a period of 42 years, to coincide with the era of satellite observations. We use data from the U.S. agencies, which report wind speeds as 1-minute average sustained winds and span the 6 main TC basins. Irregular time steps are removed such that all data are sampled at 3 hour intervals. Only tropical storm systems which reach at least category 1 wind speeds at LMI are considered in this model; extra- and post-tropical systems are excluded. The LMI of each storm is identified and time steps prior to the last occurrence of the LMI are excluded. In this sense all the tracks in the input data are decaying. Finally, decaying TC tracks are terminated at the point the wind speed falls below the tropical storm threshold. Figure 1 shows the TC track input data after processing.

Meteorological fields

The ERA5 reanalysis product provided monthly mean sea level pressure¹⁴, and wind velocity at 500 hPa¹⁵. These data were obtained at quarter degree resolution and used to provide climatological monthly mean values calculated over the observation period for the input TCs described above (1980–2021).

Potential intensity

The principal physical constraint on the model is through the thermodynamic state as defined by the potential intensity (PI), which depends on the sea surface temperature (SST) and the vertical temperature and humidity profile. Daily maps of PI were calculated for the observation period. We used the modified algorithm^16,17 based on the theoretical model of Emanuel¹⁸. This algorithm requires SST, surface pressure¹⁹, and vertical profiles of temperature and humidity²⁰. ERA5 data at half degree resolution were used for all these variables. After calculating daily values, monthly mean and climatological monthly mean values were calculated. PI values may be affected by the presence of TCs in the reanalysis data, but we expect this effect to be small in the monthly means and climatology.

Parameter fitting

Count

Annual basin count distributions are approximately Poisson with Poisson parameter, λ, equal to the annual mean count. We can therefore simulate the annual basin count, n, as

$$n \sim {\rm{Pois}}(\lambda ),$$

(1)

where λ is the mean annual basin count in the input data.
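As an illustration of Eq. 1, the following minimal sketch draws annual counts from a Poisson distribution using only the Python standard library. The sampler uses Knuth's multiplication method, which is our choice for the example and not necessarily what the IRIS code does; the value of `lam` is a hypothetical mean annual count, not a figure from the dataset.

```python
import math
import random


def sample_annual_count(lam, rng):
    """Draw one Poisson(lam) count with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count = 0
    product = rng.random()
    # Multiply uniforms until the running product falls below exp(-lam).
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


rng = random.Random(0)
lam = 12.0  # hypothetical mean annual basin count (assumption for illustration)
counts = [sample_annual_count(lam, rng) for _ in range(10_000)]
mean_count = sum(counts) / len(counts)
print(f"mean of simulated annual counts: {mean_count:.2f}")
```

Over many simulated years the sample mean converges to `lam`, which is why the long-run mean counts in a simulation of this kind reproduce the input mean.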

Track

In IRIS, observed “parent” tracks are perturbed to create stochastic “child” tracks. The scale of perturbation is derived from National Hurricane Centre (NHC) forecast cones²¹. These cones describe how the uncertainty of TC centre location typically grows with forecast time in numerical weather prediction models. A cone surrounding a track forecast contains the observed track in two thirds of cases. Figure 2 shows estimates of cone size provided by the NHC as a function of forecast time for North Atlantic, Eastern Pacific and Central Pacific TC forecasts. The rate of cone growth in the three regions is very similar, linear in time, and approximately equal to 0.025 deg/hr which is assumed globally in IRIS.

A “child” track perturbed from its “parent” by a displacement growing in time at a rate drawn from a zero-mean normal distribution with a standard deviation of 0.025 deg/hr (i.e. a constant translation velocity perturbation) will fall within its parent’s cone approximately two thirds of the time. Hence we choose this as the track perturbation model.

The “child” tracks we generate can be considered counterfactual of the historic, “parent”, track.
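A minimal sketch of this perturbation model is given below. It assumes that the rate is drawn independently for longitude and latitude and that a track is stored as a list of `(hours_since_LMI, lon, lat)` tuples; both are assumptions made for illustration, and the parent track values are invented.

```python
import random

CONE_GROWTH_DEG_PER_HR = 0.025  # cone growth rate quoted in the text


def perturb_track(parent, rng, sigma=CONE_GROWTH_DEG_PER_HR):
    """Return a child track displaced from its parent at a constant rate.

    parent: list of (hours_since_LMI, lon_deg, lat_deg) tuples.
    One rate per axis is drawn from N(0, sigma^2) and held fixed for the
    whole track, so the displacement grows linearly in time.
    """
    rate_lon = rng.gauss(0.0, sigma)
    rate_lat = rng.gauss(0.0, sigma)
    return [(t, lon + rate_lon * t, lat + rate_lat * t) for t, lon, lat in parent]


# Hypothetical parent track sampled every 3 hours from LMI.
parent = [(0, -60.0, 20.0), (3, -60.4, 20.3), (6, -60.8, 20.6), (9, -61.2, 20.9)]
child = perturb_track(parent, random.Random(1))
for (t, plon, plat), (_, clon, clat) in zip(parent, child):
    print(t, round(clon - plon, 4), round(clat - plat, 4))
```

Because the rate is fixed per track, the child coincides with the parent at t = 0 and drifts away steadily, which corresponds to the constant translation velocity perturbation described above.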

LMI

For each TC we calculate the PI at the time and location of its LMI. We tested different averaging of PI - monthly mean, monthly mean climatology, monthly maximum, monthly maximum climatology, daily, three days prior to LMI - and found very similar results for all. We confirm that the relative intensity, the ratio of LMI to PI, is uniformly distributed across observations using monthly mean PI values²². The cumulative frequency distribution of relative intensity (LMI/PI) is shown in Fig. 3. The cumulative frequency plots and linear fits (R² > 0.99) show that the distribution is indeed highly uniform. The location and value of the relative intensity is therefore the key physical basis in the model. This has the additional benefit that the observational data from the satellite era are most robust at LMI and the application of PI (relative intensity) is most appropriate at the point of LMI in the TC life-cycle.

We can then define the relative intensity as uniformly distributed:

$$\frac{LMI}{PI} \sim {\mathscr{U}}(0.4,1.0)$$

(2)

where the upper bound is chosen as 1.0, the theoretical maximum, and the lower bound chosen as 0.4 which is less than 99% of the observed LMI/PI.
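Eq. 2 can be sketched in a few lines: a relative intensity is drawn uniformly between 0.4 and 1.0 and multiplied by the local PI to give the LMI. The PI value used here is hypothetical and the function name is ours.

```python
import random


def sample_lmi(pi_ms, rng, lower=0.4, upper=1.0):
    """Sample LMI (m/s) as a uniformly distributed fraction of PI (Eq. 2)."""
    relative_intensity = rng.uniform(lower, upper)
    return relative_intensity * pi_ms


rng = random.Random(42)
pi_ms = 70.0  # hypothetical climatological PI at the LMI location, in m/s
samples = [sample_lmi(pi_ms, rng) for _ in range(5)]
print([round(s, 1) for s in samples])
```

Under this distribution the expected relative intensity is 0.7, so the mean simulated LMI at a location is 0.7 times the local PI.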

Decay

We calculate the TC decay from LMI assuming a physically informed algebraic decay, recently shown to be suitable over ocean¹¹ and land²³. Maximum surface wind speed as a function of time, Vmax(t), is given by,

$$Vmax(t)={\left[1/LMI+\kappa t\right]}^{-1},$$

(3)

where LMI is the speed at t = 0 and κ is a decay coefficient treated as a constant for a given TC decay. In previous work Wang and Toumi¹¹ have shown no simple dependence of κ on environmental conditions such as the wind shear or SST. It can thus be simulated as a stochastic process. The key determinants for the landfall wind speed are the LMI and decay time. The decay time itself is closely controlled by the location of LMI. κ is much larger over land and is treated separately.
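The algebraic decay of Eq. 3 and its inverse, the time needed to decay to a given wind speed, can be written as the short sketch below. The decay coefficient value is hypothetical, chosen only to give a plausible time scale, and its units are assumed to be (m/s)⁻¹ per hour.

```python
def vmax_at(t_hours, lmi, kappa):
    """Maximum wind speed (m/s) at t_hours after LMI, following Eq. 3."""
    return 1.0 / (1.0 / lmi + kappa * t_hours)


def hours_to_reach(v_target, lmi, kappa):
    """Invert Eq. 3: time (hours) for Vmax to decay from lmi to v_target."""
    return (1.0 / v_target - 1.0 / lmi) / kappa


lmi = 50.0     # m/s, hypothetical
kappa = 2e-4   # (m/s)^-1 per hour, hypothetical value for illustration
t18 = hours_to_reach(18.0, lmi, kappa)
print(f"hours to decay from {lmi} to 18 m/s: {t18:.1f}")
print(f"check: Vmax at that time = {vmax_at(t18, lmi, kappa):.2f} m/s")
```

Writing Eq. 3 as 1/Vmax = 1/LMI + κt makes it clear that the inverse wind speed grows linearly in time, so the decay time to any threshold follows directly from the LMI and κ.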

First we separate each TC track into “legs” which occur exclusively over either ocean or land. We then fit Eq. 3 to each decay leg using a least squares method to estimate κ. For ocean decay we find that log κ is distributed approximately normally with a dependence on LMI:

$$log\,\kappa ={m}_{a}LMI+{c}_{a}+{\varepsilon }_{a},{\varepsilon }_{a} \sim {\mathscr{N}}(0,{\sigma }_{a}^{2}).$$

(4)

The constants m_a, c_a and standard deviation, σ_a, of the noise term, ε_a, are determined through least squares fitting to the global observations. The data and fits are shown in Fig. 4a. The decay over land is treated differently. We find that the fractional surface area of ocean (i.e. not land), F_O, in a three degree radius surrounding the point of landfall is strongly correlated with the decay coefficient, with a larger landfall ocean fraction leading to a slower decay as expected²³:

$$log\,\kappa ={m}_{b}{F}_{O}+{c}_{b}+{\varepsilon }_{b},{\varepsilon }_{b} \sim {\mathscr{N}}(0,{\sigma }_{b}^{2}).$$

(5)

where parameters are estimated via least squares regression as above and data and fits are shown in Fig. 4b.
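Eqs. 4 and 5 share the same form, a linear predictor plus Gaussian noise on log κ, so a single helper can sketch both. The example assumes that log denotes the natural logarithm, which the text does not state, and every numerical coefficient below is hypothetical rather than a fitted IRIS value.

```python
import math
import random


def sample_kappa(predictor, slope, intercept, sigma, rng):
    """Sample a decay coefficient from log(kappa) = slope*x + intercept + noise.

    predictor is LMI for ocean decay (Eq. 4) or the ocean fraction F_O for
    land decay (Eq. 5). Natural log is assumed here.
    """
    log_kappa = slope * predictor + intercept + rng.gauss(0.0, sigma)
    return math.exp(log_kappa)


rng = random.Random(7)
# Hypothetical land-decay coefficients; a negative slope means that a larger
# ocean fraction gives a smaller kappa, i.e. a slower decay.
m_b, c_b, sigma_b = -2.0, -5.0, 0.5
kappa_coastal = sample_kappa(0.8, m_b, c_b, sigma_b, rng)
kappa_inland = sample_kappa(0.2, m_b, c_b, sigma_b, rng)
print(kappa_coastal, kappa_inland)
```

Sampling on the log scale guarantees a positive κ, which Eq. 3 requires for the wind speed to decay.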

Size

We use an axisymmetric modified Rankine wind speed profile²⁴ given by,

$$V(r)=\left\{\begin{array}{cc}Vmax\left(\frac{r}{RMW}\right) & r\le RMW\\ Vmax{\left(\frac{RMW}{r}\right)}^{\alpha } & r > RMW\end{array}\right.$$

(6)

where V(r) is the wind speed at radius r, Vmax is the maximum wind speed, RMW is the radius of the maximum wind speed and α is a shape parameter defined by two measurements of wind speed at different radii (not within RMW). Since RMW and the radius of storm strength winds (~18 m/s), R18, are the most widely available size observations we choose these to give,

$$\alpha =\frac{log\frac{Vmax}{18}}{log\frac{R18}{RMW}}.$$

(7)
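Eqs. 6 and 7 translate directly into code. The sketch below computes the shape parameter from Vmax, RMW and R18 and evaluates the modified Rankine profile; the storm values are hypothetical.

```python
import math


def shape_parameter(vmax, rmw, r18, v_ref=18.0):
    """Shape parameter alpha from Eq. 7 (requires vmax > v_ref and r18 > rmw)."""
    return math.log(vmax / v_ref) / math.log(r18 / rmw)


def rankine_wind(r, vmax, rmw, alpha):
    """Modified Rankine wind speed (m/s) at radius r, following Eq. 6."""
    if r <= rmw:
        return vmax * r / rmw
    return vmax * (rmw / r) ** alpha


vmax, rmw, r18 = 50.0, 30.0, 200.0  # hypothetical storm: m/s, km, km
alpha = shape_parameter(vmax, rmw, r18)
for r in (15.0, 30.0, 100.0, 200.0):
    print(r, round(rankine_wind(r, vmax, rmw, alpha), 2))
```

By construction the profile returns Vmax at r = RMW and 18 m/s at r = R18, which is a convenient consistency check on any implementation.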

A weak dependence of TC size on intensity has been observed^25,26, as has an important increase of size during the decay of a TC which contributes to the footprint of any damage²⁷. We represent this behaviour within our model. For simplicity we calculate the TC size over the ocean as the TC wind speed decays from LMI to at least 25 m s⁻¹. Below 25 m s⁻¹ the shape parameter becomes ill-defined as Vmax approaches 18 m s⁻¹ and R18 approaches RMW. The “final” size is near dissipation at Vmax = 25 m s⁻¹. TCs have an initial size at LMI and final size at Vmax = 25 m s⁻¹. The shape parameter is evaluated at LMI and Vmax = 25 m s⁻¹ using Eq. 7. The results are not sensitive to the 25 m s⁻¹ cut-off choice.

We first characterise the final size parameters, RMW₂₅ and R18₂₅. They both have approximately log-normal distributions²⁸ which can be justified theoretically²⁹ and are significantly correlated (Fig. 5a). We therefore treat these quantities as joint log-normally distributed:

$$\left(\begin{array}{c}logRM{W}_{25}\\ logR1{8}_{25}\end{array}\right) \sim {\mathscr{N}}\,\left[\left(\begin{array}{c}{\mu }_{RM{W}_{25}}\\ {\mu }_{R1{8}_{25}}\end{array}\right),\left(\begin{array}{cc}{\sigma }_{RM{W}_{25}}^{2} & \rho {\sigma }_{RM{W}_{25}}{\sigma }_{R1{8}_{25}}\\ \rho {\sigma }_{RM{W}_{25}}{\sigma }_{R1{8}_{25}} & {\sigma }_{R1{8}_{25}}^{2}\end{array}\right)\right]$$

(8)

where μ, σ and ρ are the mean, standard deviation, and correlation of the logged quantities.
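Sampling from the joint distribution of Eq. 8 needs only two independent standard normal draws, which are combined through the Cholesky factor of the 2×2 covariance matrix. The sketch below illustrates this with the standard library; all parameter values are hypothetical and are not the fitted IRIS values.

```python
import math
import random


def sample_final_size(mu_rmw, mu_r18, sd_rmw, sd_r18, rho, rng):
    """Sample (RMW25, R18_25) in km from a joint log-normal distribution."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    log_rmw = mu_rmw + sd_rmw * z1
    # The second variable shares z1 so that the logs have correlation rho.
    log_r18 = mu_r18 + sd_r18 * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return math.exp(log_rmw), math.exp(log_r18)


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(3)
rho = 0.5  # hypothetical correlation of the logged sizes
pairs = [sample_final_size(math.log(40.0), math.log(190.0), 0.5, 0.4, rho, rng)
         for _ in range(20_000)]
log_rmw = [math.log(p[0]) for p in pairs]
log_r18 = [math.log(p[1]) for p in pairs]
sample_rho = correlation(log_rmw, log_r18)
print(f"sample correlation of logs: {sample_rho:.3f}")
```

The sample correlation of the logged values should lie close to the prescribed ρ, which confirms that the draws follow the intended joint distribution.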

We find that radius of maximum wind at LMI, RMW_LMI, is related to both Vmax at LMI and RMW₂₅ (Fig. 5b,c):

$$RM{W}_{LMI}={m}_{1}LMI+{m}_{2}RM{W}_{25}+{c}_{1}+{\varepsilon }_{1},{\varepsilon }_{1} \sim {\mathscr{N}}(0,{\sigma }_{1}^{2}).$$

(9)

Similarly the initial shape parameter α_LMI is related to the initial RMW and α₂₅ (Fig. 5d,e):

$${\alpha }_{LMI}={m}_{3}RM{W}_{LMI}+{m}_{4}{\alpha }_{25}+{c}_{2}+{\varepsilon }_{2},{\varepsilon }_{2} \sim {\mathscr{N}}(0,{\sigma }_{2}^{2}),$$

(10)

where m_i, c_i and σ_i are constants to be estimated through multiple linear regression to the observation data of the decay phase.

Wind pressure relationship

Since IRIS is exclusively focused on post-LMI behaviour, we apply a new, unified relationship to determine the TC central minimum pressure across all basins for the decay phase. The size and latitude can modify the pressure wind relationship³⁰ throughout the life-cycle, so we also include this effect:

$${P}_{def}=aVmax+bVma{x}^{2}+cR18+df+\varepsilon ,\varepsilon \sim {\mathscr{N}}(0,{\sigma }^{2})$$

(11)

where a,b,c,d and σ are constants suitable for the decay estimated by multiple linear regression on the IRIS input data, f is the Coriolis parameter at the latitude of the TC, and P_def is the pressure deficit approximated by

$${P}_{def}={P}_{clim}-{P}_{min}$$

(12)

where P_clim is the monthly climatological pressure at the TC centre location. Figure 6 compares the observed central pressure during decay with that predicted by the above equation, showing good agreement.
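A minimal sketch of Eqs. 11 and 12 follows. The regression coefficients are hypothetical placeholders, since the fitted values are not given in the text, and the signed Coriolis parameter is used as written; whether the model uses its magnitude in the Southern Hemisphere is not stated, so that is an assumption.

```python
import math

OMEGA = 7.2921e-5  # Earth's rotation rate in rad/s


def coriolis(lat_deg):
    """Coriolis parameter f = 2*Omega*sin(latitude), signed."""
    return 2.0 * OMEGA * math.sin(math.radians(lat_deg))


def min_pressure(vmax, r18, lat_deg, p_clim, coeffs, noise=0.0):
    """Minimum central pressure (hPa) from Eqs. 11 and 12.

    coeffs = (a, b, c, d) are regression constants; noise is the per-TC
    Gaussian term epsilon, passed in so it can be held fixed along a track.
    """
    a, b, c, d = coeffs
    p_def = a * vmax + b * vmax ** 2 + c * r18 + d * coriolis(lat_deg) + noise
    return p_clim - p_def


coeffs = (0.5, 0.01, 0.02, 0.0)  # hypothetical values, not the fitted constants
p_min = min_pressure(vmax=40.0, r18=200.0, lat_deg=20.0, p_clim=1010.0, coeffs=coeffs)
print(f"P_min = {p_min:.1f} hPa")
```

Passing the noise term as an argument mirrors the use of a single constant noise value per TC, so that the pressure series along one track stays consistent.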

Simulation steps

Each basin and simulation year are generated independently. For a given year and basin, the number of TCs, n, is randomly sampled according to Eq. 1. Then n “parent” tracks are randomly chosen (with replacement) from that basin’s pool of historical input TCs (42 years’ worth). From each of these parent tracks (beginning at LMI), a stochastic “child” track is created by first perturbing the parent track location by constant longitude and latitude displacements drawn randomly from a normal distribution with zero mean and standard deviation of 1.0 degree. Then a displacement which grows in time at a constant rate drawn from a zero-mean normal distribution with standard deviation of 0.025 degrees per hour, reflecting the uncertainty cone growth, is applied as a subsequent perturbation. The child track is extended by extrapolating its translational motion based on its final 12 hours. Extrapolated motion relaxes to the monthly mean climatological steering wind (wind velocity at 500 hPa) on a timescale of five days. This can then represent the climatology of track curvature at higher latitudes, for example. The TC month is taken directly from the parent TC.

TC intensity simulation begins at LMI. First the climatological monthly mean PI at the month of the parent track and location of LMI is identified. The relative intensity (LMI/PI) is then sampled according to Eq. 2. Multiplying the relative intensity by PI then gives the LMI for a given TC. From LMI the intensity decays according to Eq. 3 with a decay constant given by Eq. 4. If the TC track makes landfall, a land decay constant is sampled via Eq. 5 and the TC decays accordingly. If the TC makes a subsequent seafall it decays with its original ocean decay constant. Some observed TCs have reintensified after making landfall and decaying, but this behaviour is not simulated in IRIS. If the TC moves to a location with PI below tropical storm intensity or to within 5 degrees of the equator the TC enters a fast exponential decay with a time constant of 6 hours. Decay continues until the TC intensity drops below tropical storm intensity.
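The intensity steps above can be sketched as a simple time-stepping loop. The example assumes 3-hour steps, a tropical storm threshold of 18 m/s, and that switching between ocean and land legs is handled by advancing 1/Vmax by κΔt at each step (which reproduces Eq. 3 exactly when κ is constant); the text does not specify how legs are joined, so this is an illustrative choice. All κ values are hypothetical.

```python
import math

TS_THRESHOLD = 18.0     # m/s, assumed tropical storm threshold
DT_HOURS = 3.0          # time step of the track data
FAST_DECAY_HOURS = 6.0  # time constant of the fast exponential decay


def simulate_intensity(lmi, surface, kappa_ocean, kappa_land, fast_decay=None):
    """Return Vmax (m/s) at each step from LMI until dissipation.

    surface: list of "ocean" or "land" flags, one per step after LMI.
    fast_decay: optional list of booleans marking steps where PI is below
    tropical storm intensity or the TC is within 5 degrees of the equator.
    """
    v = lmi
    series = [v]
    for i, kind in enumerate(surface):
        if fast_decay is not None and fast_decay[i]:
            v *= math.exp(-DT_HOURS / FAST_DECAY_HOURS)
        else:
            kappa = kappa_land if kind == "land" else kappa_ocean
            v = 1.0 / (1.0 / v + kappa * DT_HOURS)  # one step of Eq. 3
        if v < TS_THRESHOLD:
            break  # TC has dissipated
        series.append(v)
    return series


# Hypothetical track: 4 steps over ocean, 3 over land, 3 back over ocean.
surface = ["ocean"] * 4 + ["land"] * 3 + ["ocean"] * 3
series = simulate_intensity(60.0, surface, kappa_ocean=2e-4, kappa_land=2e-3)
print([round(v, 1) for v in series])
```

Because no re-intensification is simulated, the resulting series is monotonically decreasing, with the steepest drops on the land steps.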

Generating the size parameters is a three-phase procedure. First the distribution specified in Eq. 8 is sampled to produce a near-dissipation set of size parameters (RMW₂₅, R18₂₅) valid at Vmax = 25 m/s. This process may yield a non-physical combination (RMW₂₅ > R18₂₅), so to prevent this R18₂₅ is set to be at least 1 km greater than RMW₂₅. The shape parameter, α, at Vmax = 25 m/s, is then calculated using Eq. 7. Then RMW and α at LMI are simulated using Eqs. 9 and 10 respectively, and R18_LMI is calculated via Eq. 7. This process may occasionally produce very large R18_LMI so these are capped at 600 km. Finally RMW and R18 are calculated for each time step along the track through linear interpolation. For values below Vmax = 25 m/s, R18 is calculated through Eq. 7 with α set to α₂₅ to ensure R18 approaches RMW as Vmax approaches 18 m/s as required. The size influences the minimum pressure. Equations 11 and 12 are used to generate P_min values with a single constant noise term per TC to maintain consistency.

The above method was used to generate 10,000 years of synthetic TCs in each of the six basins. The results of this simulation are analysed below.

Data Records

The IRIS 10,000 year TC event set is available from figshare³¹. The data are stored in simple space-delimited text files with each line representing a single time step of a single TC. Each file has 1,000 years of TC data for a given basin. Descriptions of the columnar contents of the data are provided in Table 1.

Table 1 Description of columns in IRIS dataset.

Technical Validation

Summary statistics

Since the IRIS model takes IBTrACS TC observation data as its input, we perform an initial validation by comparing summary statistics of the two. Key points in a TC lifecycle represented in IRIS are the LMI and landfall, if it occurs. We therefore present analysis of TC parameters at these points per basin and globally in Table 2 and Fig. 7 based on the 10,000 years of simulated IRIS data. Following a previous approach¹⁰, we do not intend to test for significant differences from the observations, but instead are satisfied if important TC risk metrics are broadly well-represented in IRIS.

Table 2 Summary statistics of observations (IBTrACS) and 10,000 years of IRIS simulation.

The IRIS mean annual counts per basin and globally match exactly those in IBTrACS because the Poisson count model takes a single parameter (mean count) from the input data, so over a long enough simulation, the counts will converge to those in the input data. The variability of annual count is also well-represented in IRIS indicating the Poisson assumption is adequate. The LMI behaviour is more complex and leads to small biases across the basins and a global bias of −1.6 m/s. This bias is likely due to the choice of distribution to represent the relative intensity (LMI/PI). The variability of LMI is also similar in both observations and simulation with a small bias of +0.2 m/s in the interquartile range globally. The minimum pressure, Pmin, at LMI also performs well in the simulation with a global bias of only +2.1 hPa and very similar interquartile range. There is a difference in the sign of the bias across basins, likely due to the choice of using a global pressure wind relationship, rather than using a per-basin formula as is common practice in best track type observation data.

Of greater importance from a risk perspective are characteristics of landfalling TCs. The annual frequency of landfalling TCs is well-represented, with a small positive bias of only +0.9 landfalls per year globally in IRIS compared to the observations. This bias compares favourably with others¹⁰. The mean maximum wind speed at landfall is 39.4 m/s in the observations, compared to 40.7 m/s in IRIS. The sign of the bias varies across basins. The interquartile range of Vmax at landfall is similar in IRIS (21.3 m/s) to the observations (19.5 m/s). Pmin at landfall has a small global bias of −2.5 hPa but again the sign differs across basins as above.

The global mean radius of maximum wind speed, RMW, is slightly smaller in the model (40.4 km) than observed (41.9 km). In contrast the model global mean radius of storm-strength winds, R18, at landfall is slightly larger (196 km) than in the observations (193 km). The sign of the bias differs across basins. The discrepancy is likely the result of several factors. It is worth noting that the size parameterisation is global, so differences in size between basins are not explicitly accounted for. Furthermore, the size parameterisation is not fitted to landfall data but simulated. Finally any bias in mean landfall Vmax will also affect the landfall size.

Spatial coverage

A primary purpose of synthetic TC track data is to fill in the gaps in observed TC tracks which exist due to the limited period of reliable historical observation data. Figure 8 shows the tracks of a sample of 1000 years of IRIS output and may be compared with the 42 years of observed tracks in Fig. 1. The IRIS tracks form a much more comprehensive coverage of the major TC regions, with all vulnerable coastal areas densely populated with tracks, allowing for a good analysis of risk not possible in the observation set. IRIS has very few very long tracks extending beyond their origin basin. Some tracks extend inland over continents further than in the observed tracks. This is a natural consequence of the statistical extrapolation performed by this type of model where not all relevant physics is present. For example, ocean decays are terminated when the PI drops below tropical storm intensity, typically at higher latitudes, but this measure does not include, for example, a vertical wind shear term which may result in earlier and faster dissipation. For decay over land, no explicit adjustment is made for orographic effects which again may lead to earlier dissipation and reduce continental penetration. Overall IRIS performs as desired, increasing the coverage and density of tracks while preserving the basic observed pattern.

Return period analysis

Perhaps the most important validation concept from a risk perspective is that of the return period and return value. The return period is the inverse occurrence frequency of an event of at least a given magnitude, the return value. The shape of return period curves is sensitive to the upper tails of the return value distribution. By comparing IRIS and observed return periods we can assess the ability of IRIS to produce extreme events statistically compatible with those observed. We split the 10,000 years of synthetic tracks into 238 ensembles of 42 years, which is the time span of observations. The return curves for events within 2° of three sample locations vulnerable to landfalling TCs in each basin were calculated. The spread of these curves then offers an estimate of the sampling error present in the observed curves. Observed and simulation ensemble curves are shown in Fig. 9. First, it is clear from this analysis that IRIS is capable of generating events outside the range of the input data. Observed return values are generally within the range of the simulated ensemble return values across all return periods in all basins. In some locations (e.g. Honolulu, Fiji) the observed return values are toward the upper range of the simulated values. In others (e.g. New Orleans, Hong Kong) observations are towards the lower end of the simulations. This may be due to model bias in these areas or due to local sampling uncertainty in the observation record bearing in mind that the observations are only one outcome of plausible histories.
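The return period definition and the ensemble split can be illustrated with the sketch below. It uses a simple empirical estimator (record length divided by the rank of each event) and assumes the events near a location are available as a plain list of wind speeds; this is a generic illustration and not necessarily the estimator used to produce Fig. 9.

```python
def return_periods(event_values, n_years):
    """Empirical return period (years) for each event magnitude.

    The largest event is equalled or exceeded once in n_years, the second
    largest twice, and so on, giving a period of n_years / rank.
    """
    ranked = sorted(event_values, reverse=True)
    return [(value, n_years / rank) for rank, value in enumerate(ranked, start=1)]


def split_into_ensembles(years, span=42):
    """Split a sequence of simulated years into consecutive blocks of `span`."""
    n_blocks = len(years) // span
    return [years[i * span:(i + 1) * span] for i in range(n_blocks)]


# Hypothetical landfall wind speeds (m/s) near one location over 42 years.
curve = return_periods([30.0, 50.0, 40.0], n_years=42)
print(curve)

ensembles = split_into_ensembles(list(range(10_000)), span=42)
print(len(ensembles))
```

Splitting 10,000 years into blocks of 42 gives 238 complete ensembles with four years left over, which matches the ensemble count quoted above.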

Given that IRIS compares well with the observations in this extreme event validation which is more relevant than the mean summary statistics, we can state that the IRIS dataset is suitable for the purposes of TC hazard and risk analysis.

Usage Notes

The IRIS dataset³¹ is based on the mean of the 42-year input observations. The 10,000 year output is fixed to the “current” 1980–2021 TC climate rather than a prediction of any sort. The IRIS model produces independent synthetic years, so the dataset has no inter-annual correlation which may be important for some applications, for example, calculating multi-year risk. The magnitude of the inter-annual variability in the dataset resembles that of the 42-year observation period which may not contain the full range of the multidecadal variability of the Earth climate system.

Furthermore, the observations upon which the IRIS dataset is based represent only one realisation of the 1980–2021 climate and as such are a limited sample. We have tested the sensitivity of IRIS to removing significant observed events from its input set (not shown) and find the results are robust on regional and sub-regional scales. However we do still expect some impact of the limited observation sample size on the IRIS dataset and note that it is representative only of the observed climatology.

Basins are simulated independently so any inter-basin dependence which may be present in the observations is not present in IRIS. This may lead to over- or underestimates of TC metrics when basins are aggregated.

Some studies suggest there may have been a poleward shift of LMI in some basins. We do not attempt to model trends in LMI location, or any TC phenomena, over the observation period and hence the output may not be representative of the current climate in 2023, but rather the mean climate of the observation period.

Care should be taken interpreting the dataset in regions where it is likely the storm systems may no longer be considered tropical, i.e. at high latitudes. The dataset is not intended to represent non-tropical systems, and since extra- and post-tropical systems were excluded from the fitting process and input track data, the decay and size models may not be appropriate for these storms. There is no treatment of extratropical transition in the model.

Finally, the IRIS track model is based on perturbing observed tracks by an amount consistent with the error in current TC track forecasts. The forecast error of TC tracks has reduced over time as models have improved but one study suggests that we may have reached the limit of predictability of TC tracks³². Therefore, although somewhat subjective, the current forecast error cone may be considered an estimate of the inherent unpredictability of TC tracks, but we accept this may change over time. Furthermore, the data we used to define the cone extend only to 5 days and the assumption of linear growth may not be true beyond this. However, we note that all the validation metrics presented above are insensitive to this choice of track perturbation model and applying no error-cone perturbation to the “parent” track changes simulated global mean landfall rates and intensities by less than 1% and basin values by less than 2%. However, some regions when examined on smaller spatial scales may be more sensitive to the choice of track perturbation model.

Code availability

The IRIS code is publicly available (https://github.com/njsparks/iris) and a release of the version described here has been archived³³.

References

Bank, W. & Nations, U. Natural hazards, unnatural disasters: the economics of effective prevention (The World Bank, 2010).
Mendelsohn, R., Emanuel, K., Chonabayashi, S. & Bakkensen, L. The impact of climate change on global tropical cyclone damage. Nature Climate Change 2, 205–209, https://doi.org/10.1038/nclimate1357 (2012a).
Knutson, T. et al. Tropical Cyclones and Climate Change Assessment: Part II: Projected Response to Anthropogenic Warming. Bulletin of the American Meteorological Society 101, E303–E322, https://doi.org/10.1175/BAMS-D-18-0194.1 (2020).
Vickery, P. J., Skerlj, P. F. & Twisdale, L. A. Simulation of Hurricane Risk in the U.S. Using Empirical Track Model. Journal of Structural Engineering 126, 1222–1237, https://doi.org/10.1061/(ASCE)0733-9445(2000)126:10(1222) (2000).
Emanuel, K., Ravela, S., Vivant, E. & Risi, C. A statistical deterministic approach to hurricane risk assessment. Bulletin of the American Meteorological Society 87, 299–314, https://doi.org/10.1175/BAMS-87-3-299 (2006).
Lee, C. Y., Tippett, M. K., Sobel, A. H. & Camargo, S. J. An environmentally forced tropical cyclone hazard model. Journal of Advances in Modeling Earth Systems 10, 223–241, https://doi.org/10.1002/2017MS001186 (2018).
Lin, J., Rousseau-Rizzi, R., Lee, C.-Y. & Sobel, A. An Open-Source, Physics-Based, Tropical Cyclone Downscaling Model With Intensity-Dependent Steering. Journal of Advances in Modeling Earth Systems 15, e2023MS003686, https://doi.org/10.1029/2023MS003686 (2023).
James, M. K. & Mason, L. B. Synthetic Tropical Cyclone Database. Journal of Waterway, Port, Coastal, and Ocean Engineering 131, 181–192, https://doi.org/10.1061/(ASCE)0733-950X(2005)131:4(181) (2005).
Haigh, I. D. et al. Estimating present day extreme water level exceedance probabilities around the coastline of Australia: tropical cyclone-induced storm surges. Climate Dynamics 42, 139–157, https://doi.org/10.1007/s00382-012-1653-0 (2014).
Bloemendaal, N. et al. Generation of a global synthetic tropical cyclone hazard dataset using STORM. Scientific Data 7, 1–12, https://doi.org/10.1038/s41597-020-0381-2 (2020).
Wang, S. & Toumi, R. On the intensity decay of tropical cyclones before landfall. Scientific Reports 12, 1–8, https://doi.org/10.1038/s41598-022-07310-4 (2022).
Knapp, K. R. et al. The International Best Track Archive for Climate Stewardship (IBTrACS). Bulletin of the American Meteorological Society 91, 363–376, https://doi.org/10.1175/2009BAMS2755.1 (2010).
Knapp, K. R., Diamond, H. J., Kossin, J. P., Kruk, M. C. & Schreck, C. J. International Best Track Archive for Climate Stewardship (IBTrACS) Project, Version 4. NOAA National Centers for Environmental Information. https://doi.org/10.25921/82ty-9e16 (Accessed on 03-FEB-2022) (2018).
Hersbach, H. et al. ERA5 monthly averaged data on single levels from 1940 to present. Copernicus Climate Change Service (C3S) Climate Data Store (CDS), https://doi.org/10.24381/cds.f17050d7 (Accessed on 13-APR-2023) (2023).
Hersbach, H. et al. ERA5 monthly averaged data on pressure levels from 1940 to present. Copernicus Climate Change Service (C3S) Climate Data Store (CDS), https://doi.org/10.24381/cds.6860a573 (Accessed on 13-APR-2023) (2023).
Gilford, D. M. pyPI (v1.3): Tropical Cyclone Potential Intensity Calculations in Python. Geoscientific Model Development 14, 2351–2369, https://doi.org/10.5194/gmd-14-2351-2021 (2021).
Bister, M. & Emanuel, K. A. Low frequency variability of tropical cyclone potential intensity 1. Interannual to interdecadal variability. Journal of Geophysical Research Atmospheres 107, ACL 26–1–ACL 26–15, https://doi.org/10.1029/2001JD000776 (2002).
Emanuel, K. A. An Air-Sea Interaction Theory for Tropical Cyclones. Part I: Steady-State Maintenance. Journal of the Atmospheric Sciences 43, 585–605, https://doi.org/10.1175/1520-0469(1986)043<0585:AASITF>2.0.CO;2 (1986).
Hersbach, H. et al. ERA5 hourly data on single levels from 1940 to present. Copernicus Climate Change Service (C3S) Climate Data Store (CDS), https://doi.org/10.24381/cds.adbb2d47 (Accessed on 15-FEB-2023) (2023).
Hersbach, H. et al. ERA5 hourly data on pressure levels from 1940 to present. Copernicus Climate Change Service (C3S) Climate Data Store (CDS), https://doi.org/10.24381/cds.bd0915c6 (Accessed on 15-FEB-2023) (2023).
National Hurricane Center. Definition of the NHC Track Forecast Cone. https://www.nhc.noaa.gov/aboutcone.shtml. Accessed: 2022-10-18 (2022).
Emanuel, K. A Statistical Analysis of Tropical Cyclone Intensity. Monthly Weather Review 128, 1139–1152, https://doi.org/10.1175/1520-0493(2000)128<1139:ASAOTC>2.0.CO;2 (2000).
Phillipson, L. M. & Toumi, R. A Physical Interpretation of Recent Tropical Cyclone Post-Landfall Decay. Geophysical Research Letters 48, e2021GL094105, https://doi.org/10.1029/2021GL094105 (2021).
Mallen, K. J., Montgomery, M. T. & Wang, B. Reexamining the Near-Core Radial Structure of the Tropical Cyclone Primary Circulation: Implications for Vortex Resiliency. Journal of the Atmospheric Sciences 62, 408–425, https://doi.org/10.1175/JAS-3377.1 (2005).
Wang, S. & Toumi, R. A historical analysis of the mature stage of tropical cyclones. International Journal of Climatology 38, 2490–2505, https://doi.org/10.1002/joc.5374 (2018).
Sparks, N. & Toumi, R. The Dependence of Tropical Cyclone Pressure Tendency on Size. Geophysical Research Letters 49, e2022GL098926, https://doi.org/10.1029/2022GL098926 (2022).
Wang, S. & Toumi, R. On the relationship between hurricane cost and the integrated wind profile. Environmental Research Letters 11, 114005, https://doi.org/10.1088/1748-9326/11/11/114005 (2016).
Chavas, D. R. & Emanuel, K. A. A QuikSCAT climatology of tropical cyclone size. Geophysical Research Letters 37, https://doi.org/10.1029/2010GL044558 (2010).
Wang, S. & Toumi, R. An analytic model of the tropical cyclone outer size. npj Climate and Atmospheric Science 5, 1–10, https://doi.org/10.1038/s41612-022-00270-6 (2022).
Knaff, J. A. & Zehr, R. M. Reexamination of tropical cyclone wind-pressure relationships. Weather and Forecasting 22, 71–88, https://doi.org/10.1175/WAF965.1 (2007).
Sparks, N. & Toumi, R. IRIS: The Imperial College Storm Model Dataset, figshare, https://doi.org/10.6084/m9.figshare.c.6724251.v1 (2023).
Landsea, C. W. & Cangialosi, J. P. Have We Reached the Limits of Predictability for Tropical Cyclone Track Forecasting? Bulletin of the American Meteorological Society 99, 2237–2243, https://doi.org/10.1175/BAMS-D-17-0136.1 (2018).
Sparks, N. J. njsparks/iris: IRIS: Imperial college storm model. Zenodo https://doi.org/10.5281/zenodo.10948611 (2024).
Copernicus Climate Change Service (C3S). ERA5 monthly averaged data on single levels from 1940 to present. Copernicus Climate Change Service (C3S) Climate Data Store (CDS), https://doi.org/10.24381/cds.f17050d7 (Accessed on 13-APR-2023) (2023).
Copernicus Climate Change Service (C3S). ERA5 monthly averaged data on pressure levels from 1940 to present. Copernicus Climate Change Service (C3S) Climate Data Store (CDS), https://doi.org/10.24381/cds.6860a573 (Accessed on 13-APR-2023) (2023).
Copernicus Climate Change Service (C3S). ERA5 hourly data on single levels from 1940 to present. Copernicus Climate Change Service (C3S) Climate Data Store (CDS), https://doi.org/10.24381/cds.adbb2d47 (Accessed on 15-FEB-2023) (2023).
Copernicus Climate Change Service (C3S). ERA5 hourly data on pressure levels from 1940 to present. Copernicus Climate Change Service (C3S) Climate Data Store (CDS), https://doi.org/10.24381/cds.bd0915c6 (Accessed on 15-FEB-2023) (2023).

Acknowledgements

Tropical cyclone data are available from the International Best Track Archive for Climate Stewardship version 4: https://www.ncei.noaa.gov/products/international-best-track-archive¹³. ERA5 monthly average single level data¹⁴ were downloaded from³⁴. ERA5 monthly average pressure level data¹⁵ were downloaded from³⁵. Hourly ERA5 single level data¹⁹ were downloaded from³⁶. Hourly ERA5 pressure level data²⁰ were downloaded from³⁷. National Hurricane Center forecast cone statistics were obtained from https://www.nhc.noaa.gov/aboutcone.shtml on 22-OCT-2022. The research was supported by the Vodafone Foundation, the UK Centre for Greening of Finance and Investment (UKRI-NE/V017756/1) and the Singapore Green Finance Centre.

Author information

Authors and Affiliations

Imperial College London, Department of Physics, London, SW7 2AZ, UK
Nathan Sparks & Ralf Toumi

Authors

Nathan Sparks
Ralf Toumi

Contributions

N.S. developed the IRIS model and dataset. N.S. and R.T. contributed ideas and were involved in the writing process.

Corresponding author

Correspondence to Nathan Sparks.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Sparks, N., Toumi, R. The Imperial College Storm Model (IRIS) Dataset. Sci Data 11, 424 (2024). https://doi.org/10.1038/s41597-024-03250-y

Received: 04 July 2023
Accepted: 10 April 2024
Published: 24 April 2024
DOI: https://doi.org/10.1038/s41597-024-03250-y