Introduction

The SARS-CoV-2 pandemic sparked a broad range of interest in both the mechanism and the risks of viral transmission. Early in the pandemic, the mechanism of viral transmission was a matter of controversy, with a claim that transmission occurred either via contact or by the short-range spread of emitted droplets omitting the potential for longer-range airborne transmission1. Subsequent work highlighted the importance of aerosolised particles in causing long-range airborne transmission2 while downplaying the importance of contact-driven events3.

Studies of the risk of transmission examined the relationship between transmission and the environment, with for example higher rates of transmission being found in household compared to workplace environments4. Quantitative models were developed, assessing the risk of infection in a different scenarios5,6,7,8, modelling the relationship between risk and exposure time9, and explaining the role of masks in preventing viral spread10. CO2 monitoring was suggested as a means to evaluate the immediate risk of transmission11.

While risk calculations consider whether a person might be infected, evolutionary biology poses a different question: If a person was infected, how many viruses initiate that infection? This number of viruses, denoted the transmission bottleneck12, has important consequences for virus evolution: The tighter the bottleneck, and the fewer particles get through, the less genetic diversity will be transmitted between individuals. The absence of initial diversity can limit the potential for within-host evolution, as variants need to be generated de novo before evolutionary changes can take effect13.

Studies of genomic data have suggested that for influenza and SARS-CoV-2 infection, the transmission bottleneck generally involves few viral particles14,15,16,17, with potentially a single virus initiating infection. Different genomic approaches have been applied to this question. In animal models, barcoded viruses allow for a straightforward count of the number of viruses initiating infection18. Where barcoding is not possible, deep sequencing of a viral population has been used to assess the appearance or non-appearance of minor variants following the bottleneck19 or to evaluate changes in variant frequencies during the transmission process20,21,22,23. Genomic studies have some limitations. Collecting genomic data from individuals is time-consuming and expensive, while the results of such studies may reflect only the specific circumstances of the individuals involved. The estimation process itself requires some care: the false identification of variants has the potential to inflate the estimated bottleneck size24,25. Errors in identifying who infected who could also potentially distort results.

We here take an alternative approach to estimating respiratory virus transmission bottleneck sizes. Rather than considering only specific circumstances or data, we outline a general solution, exploiting knowledge of the physical processes underlying viral transmission26,27 to build a physical model of virus transmission. Within this model, we exploit knowledge from an extensive past literature28. Coughing, speaking and sneezing have been shown to emit broad and distinct distributions of particle sizes29,30,31. Emitted particles are affected by evaporation, sedimentation and diffusion32. Ventilation reduces the mean concentration of particles in the air, while in the absence of immediately finding a new host, viruses in emitted particles begin to decay33. Combining insights from this literature, we assess the expected transmission bottleneck for infections that occur under a variety of scenarios. Our results provide a strong indication, independent of genome sequence data, that most cases of respiratory virus transmission will involve a tight population bottleneck.

Results

Constant levels of exposure

A simple model, based upon a Wells-Riley model of exposure, suggested that transmission bottlenecks arising from exposure to a respiratory virus are likely to be tight (Fig. 1). We made the simplifying assumptions that all individuals in an environment receive the same level of viral exposure and that viruses cause infection independently of one another. Under these assumptions, bottleneck sizes are small unless a very large proportion of individuals present in an environment are infected. The Skagit choir superspreading event was an extreme case early in the SARS-CoV-2 pandemic whereby between 32 and 52 of 61 people present at a choir rehearsal were infected34. Even in this extreme case, our model predicted that between 33% and 75% of cases of infection were initiated by a single viral particle, with more than 99% of cases being initiated by fewer than 10 viruses.

Fig. 1: Transmission bottleneck estimates under a Wells-Riley model of exposure.
figure 1

In this model, the level of exposure describes the rate parameter of a Poisson model, such that a person receiving an exposure of 1 would expect to be infected by one virus. a The probability of an individual being infected given the level of exposure. b The proportion of cases of infection in which a single virus initiates infection. c The proportion of cases of infection in which ten or fewer viruses initiate infection. The vertical grey bar provides an estimate of circumstances at the Skagit Choir superspreading event, characterised by the probability of infection. Even under these circumstances, the model suggests that transmission bottlenecks are likely to be small.

Variable levels of exposure

More complex models of exposure produced similar results, suggesting that the transmission bottlenecks produced by respiratory infection are generally tight. A straightforward approach to expanding our initial model is to incorporate overdispersion into the exposure levels; this did not substantially change the results obtained (Supplementary Fig. 1, Supplementary Note 1). To achieve a more realistic estimate of the extent to which exposures vary, we implemented a physical model of virus transmission. Our model describes the emission by an infected individual of virus-containing particles with a distribution of sizes, by default modelling a process of coughing. Emitted particles may contain more than one virus, according to their size. Particles are subject to evaporation and spread through the air by diffusion. They are lost from the air due to ventilation and sedimentation. Viruses within particles are inactivated over time (Fig. 2a).

Fig. 2: Method for simulating transmission events.
figure 2

a A computational model described the emission and subsequent dynamics of virus-containing particles following a single cough. We modelled the diffusion of particles of different sizes through space and time, accounting for evaporation, sedimentation, ventilation, and the inactivation of viruses within infectious particles. Our model describes the time- and location-dependent concentration of infectious material within an environment. b Our model facilitates the calculation of the cumulative volume of infectious material that we would expect for different individuals in an environment. Specifying an effective viral load, or alternatively the parameter Renv, which describes the expected number of infections to occur within an environment, generates viral exposures, which describe the expected number of infectious viruses that initiate infection within each person: The outcome of exposure, whether infection or non-infection, is characterised by this viral exposure.

Levels of physical exposure were calculated based on estimated inhalation rates and then converted into viral exposures (Fig. 2b). Our model describes an effective viral load, defined as the number of viruses per ml of emitted material that initiate infection, having overcome the various barriers, whether physical or immunological, to achieve this. The effective viral load is by nature smaller than the absolute number of viruses contained within an emitted particle: One study has estimated the proportion of emitted SARS-CoV-2 viruses that are viable (measured experimentally via plaque- or focus-formation in cells facilitating infection) as roughly 1 in 300035. Plaque formation is likely a necessary requirement for a virus to cause infection but may not be sufficient: A virus that would form a plaque under laboratory conditions might not be able to cause infection in a host.

Applied to four different environments and run under default parameters, our model suggested that respiratory viral infection arising following coughing is associated with a tight transmission bottleneck. Clear environmental impacts upon exposure were evident, with the highest exposure occurring at close proximity in the poorly ventilated lounge. In the bus, a very broad distribution of exposures was found, with the simulated absorption of particles by the sides of the bus leading to low exposures far from the infected person (Supplementary Fig. 2). Transmission bottlenecks were not universally tight: One of the multiple simulations we generated describing the nightclub environment included a case where 391 viruses initiated infection. However, in all the environments and under our default parameters, more than 98% of transmission events were predicted to involve ten or fewer viruses, with the majority of cases of infection being initiated by a single viral particle (Fig. 3).

Fig. 3: Bottleneck size distributions calculated for different scenarios.
figure 3

Maps show the layouts of different environments. A red dot indicates the location of an infected person, with the red arrow showing the direction in which emissions occur. A white dot indicates the location of an uninfected person. In our model, individuals were assumed to remain stationary. Furniture did not affect the model and is shown for purely illustrative reasons. Data show the room dimensions. Ventilation levels are described by the number of air changes per hour (ACH). The value Renv describes the expected number of people infected in an environment during the modelled time of exposure. Bottleneck size distributions show empirical probabilities calculated from an ensemble of 106 simulations generated for each environment.

The outputs from our model include a parameter, Renv, describing the expected number of cases of infection occurring in each environment. This parameter is akin to the common epidemiological parameter R0. Where R0 describes the expected total number of infections caused by an infected individual during the entire course of an infection in the absence of population immunity36, Renv describes the expected number of infections caused by an infected individual in a specific environment, given the number of uninfected individuals present, their relative positions, the length of time spent in that environment, and the prevailing environmental conditions. Under our default parameters, these numbers were generally small, ranging from 0.058 in the bus to 0.271 in the nightclub, reflecting the limited time modelled in each scenario. The value of 0.067 in the office environment is a feature of our default model, calculated from the R0 value of the original Wuhan strain SARS-CoV-2 virus (see Supplementary Methods 1). Keeping the level of physical exposure constant, an increase in the effective viral load leads to an increase in Renv.

Large bottlenecks at very high effective viral load

Under our default model, tight transmission bottlenecks were inferred to dominate in all but exceptionally high values of the effective viral load (Fig. 4a, b). Most transmission events involved 10 or fewer viral particles unless the effective viral load was greater than 109.2 per ml. This value is greatly in excess of an estimated upper bound for the number of plaque-forming units at peak viral load during SARS-CoV-2 infection35. At this concentration, high transmission bottlenecks occur following the inhalation of even a single emitted particle: A particle of radius 10 μm would be expected to contain more than 6 effective viruses.

Fig. 4: Inferred statistics of transmission bottleneck sizes given changes in the effective viral load.
figure 4

Statistics were calculated from an ensemble of 106 simulations for each combination of environment and effective viral load. Lines connect points calculated at different viral loads. The dashed vertical black line shows the mean number of plaque-forming units at the peak of SARS-CoV-2 infection, while the grey shaded area shows a 95% confidence interval for this statistic35. The solid vertical black line shows the effective viral load as specified in the default parameters of our model. Statistics are shown describing (a). The proportion of transmissions with bottleneck size 1. b The proportion of transmissions with bottleneck size 10 or less. c The expected number of people infected in each environment, Renv. Horizontal dashed lines in this figure indicate limit values for each environment, as would occur given a theoretically infinite effective viral load.

In most of the environments we simulated, one person coughing was not sufficient for everyone present to inhale a single emitted particle. As such, not everyone in the environment was infected, even at extremely high simulated effective viral loads (Fig. 4c). Alternative models described higher levels of particle emission. For example, a model of continuous, uninterrupted speech, while allowing for asymptomatic transmission, generated higher volumes of particles than did coughing. Increased volume excepted patterns of physical exposure from speaking were similar to those derived from coughing (Supplementary Fig. 3). The increased volume emitted led to an increase in the values of Renv in each environment. Compared to the coughing model, a decrease in the proportion of infections initiated by a single viral particle was inferred for the lounge environment, but otherwise, results were very similar (Supplementary Fig. 4). Most transmission events still involved 10 or fewer viral particles unless the effective viral load was greater than 108 per ml, again substantially above published estimates for SARS-CoV-2 infection (Supplementary Fig. 5).

Large bottlenecks during extreme superspreading

Modelling identified a second scenario in which larger bottlenecks could prevail. If a highly effective viral load is combined with a very large volume of infectious material is emitted, most infections are initiated by more than 10 viral particles. We generated variants of the coughing model in which the volume of particles emitted was arbitrarily increased. At exceptionally high volumes of emission, the need for individuals present to inhale an emitted particle is no longer a consideration; nearly everyone present receives some physical exposure. This implies that, above a threshold effective viral load, everyone is likely to be infected, while at some higher threshold effective viral load, everyone is likely to be infected by more than 10 viral particles. The exact values of thresholds were environment-dependent, being affected by the distribution of physical exposures.

With a 1000-fold increase in emission volume, simulation data suggested that most infections in the office environment were likely to be initiated by more than 10 viral particles if the effective viral load was in excess of 107.5 per ml (Fig. 5). At this threshold Renv was close to the hypothetical maximum value of 8, such that everyone present was highly likely to be infected. For the nightclub, a threshold effective viral load implied a Renv value close to 100. Our lounge environment was designed to maximise physical exposure, with a single individual being in prolonged short-range proximity to an infected individual in an environment characterised by poor ventilation. At 1000-fold increased emission volume, most infections in the lounge were likely to be initiated by more than 10 viral particles at an effective viral load of close to 106, still high but within biological plausibility. In the bus the high level of variation in physical exposure levels meant that 1000-fold increased emissions were still not sufficient to infect everyone present.

Fig. 5: Effect of increasing the volume of particles emitted upon the expected number of individuals infected and upon transmission bottleneck statistics.
figure 5

Data show outputs from a model of coughing with increased emission volume. Bottleneck sizes were calculated from an ensemble of 106 simulations for each combination of environment, volume emitted, and effective viral load. Where environments contained multiple uninfected people, increases in the volume of particles emitted led to substantial increases in the number of people infected. However, at all but the highest viral loads, most transmission bottlenecks still involved few viral particles.

This high-emission, high-viral load scenario represents exposure to overwhelming numbers of viruses. The identification of superspreading events early in the SARS-CoV-2 pandemic37,38 suggests that such a scenario could be biologically plausible. Behaviours such as singing or shouting would generate higher volumes of emission than our model of speech39. However, basic epidemiology suggests that these events are rare: the single-figure values of R0 associated with most respiratory viruses are not compatible with a situation in which infected people transmit the disease to the majority of their contacts.

Conclusions

Concluding our analysis, we note that individual cases of transmission involving multiple viral particles may arise under unspectacular circumstances: In our default model between 1 and 3% of transmissions involved more than 10 viruses. However, a scenario in which the majority of transmission events involve more than 10 viruses is unlikely, requiring a very high effective viral load, possibly combined with abnormally high levels of particle emission. Our model is parameterised in a way that is consistent with SARS-CoV-2 infection but is not specific to that virus. Where a virus is spread by respiratory transmission, if R0 is not exceptionally high, the physical process of airborne transmission will lead to mostly tight transmission bottlenecks.

Our model may be elaborated in a variety of ways, considering, for example, the movement of people within an environment, emissions via sneezing, changes in ventilation levels, or variable levels of infectivity. None of these changes produced substantial changes in our basic results. Details are given in Supplementary Note 2.

Discussion

We have here applied two distinct modelling approaches to consider the transmission bottleneck sizes generated by respiratory viral transmission. In a first, highly simplified approach, we showed that, in a case where all exposed individuals receive an equal level of exposure, the Poisson assumption underlying the Wells-Riley model implies that the great majority of transmission events involve a small number of viral particles, even in cases where a high proportion of individuals present are infected. We next considered a more complex, though still approximate method, in which different individuals received different exposures, calculated from a physical model describing particle emission and spread. In this latter model, we identified that the airborne transmission of viruses is dominated by tight transmission bottlenecks in all but two cases. Firstly, if the effective viral load is sufficiently high, the inhalation of a single emitted particle will result infection by several viruses so that transmission bottlenecks will be high irrespective of the level of exposure. Secondly, where the effective viral load and the volume of emitted particles are both very high, most cases of infection will again involve large numbers of viruses; this latter case is associated with a high proportion of individuals present being infected. We believe that each of these two cases represents rare circumstances. In the former case, the effective viral load needed is greatly in excess of a published estimate of the number of plaque-forming units of SARS-CoV-2 at peak infection. In the second case, the very large number of infections generated by transmission is so high as to imply either that these circumstances are very unusual or that the virus has an extremely high value of Renv and, therefore, of R0.

The results of our model are consistent with previous studies of transmission bottlenecks that have used viral genome sequence data. For example, a study of influenza virus transmission suggested that between 28 and 31 (73–82%) of a set of 38 transmission events were likely to have been founded by a single virus15,40: Under default parameters, our model produced similar results.

Our approach is distinct from previous work in that, not relying upon genomic data describing any particular virus or circumstance, our result is a general one. In this sense, our work is predictive: If there were to be an outbreak of a novel virus spreading by airborne transmission, our model suggests that the transmission of that virus would be characterised by tight transmission bottlenecks.

While our model of particle spread captures the basic features of respiratory virus transmission, it still makes multiple simplifications. For example, our model neglects effects arising from convection currents caused by individuals in a room41. Effects such as these have the potential to generate non-monotonic levels of exposure with distance from an infected person, as particles are carried up and across the ceiling before falling to a height at which they can be breathed in. Our model also neglects the effects of local humidity on particle spread42, as well as any detailed description of ventilation, such as the placement of windows, ceiling vents, and air conditioning units. Such effects are likely to have a distorting effect on the patterns of exposure we identify, potentially enhancing disparities in exposure. The potential effects of a highly skewed exposure distribution are represented by our simulated bus environment where the long and thin shape of the bus leads to extreme disparities in exposure (Supplementary Fig. 2). These disparities place an effective ceiling on Renv (Fig. 5, Supplementary Fig. 5), with some people out of reach of emitted particles. The transmission bottlenecks we inferred for the bus were not substantially distinct from those of other environments.

The simplifications involved in our model limit direct comparison with real-world scenarios. A recent publication suggested an infection risk of slightly under 10% for individuals in the most risky scenarios after 8 h of exposure43, which compares to 18.7% or 74.9% in our default lounge models of coughing and continuous speech: As discussed in Supplementary Methods 1, our default parameters likely overestimate the effective viral load in a realistic situation. We note that a complete accounting for transmission would require an account of the precise distributions of emissions, viral load, and time-dependent proximity between individuals, alongside environmental parameters and a detailed description of human behaviour.

One area of uncertainty relevant to our model is the relationship between the raw numbers of viral particles contained in emitted material, the number of plaque-forming units (PFU) this represents, and the true effective viral load. The raw number of viruses in emitted material, known as the viral load, follows a pattern of growth and then decay during the course of infection, which in SARS-CoV-2 infection reach a peak potentially of 1010 viruses per ml44. In our default model, we have used parameters from one study describing SARS-CoV-2 infection, which suggest a 3000-fold ratio between the raw number of viruses and the number of PFUs35, further assuming a 1:1 ratio between PFUs and the effective viral load. We note considerable variation in the literature on this ratio. Experimental work has suggested a strain-dependent ratio between SARS-CoV-2 viral load and focus-forming units, a measure in some ways similar to PFU, of between 104:1 and 106:145, while a challenge study of SARS-CoV-2 infection estimated close to a 105:1 ratio46. The relationship between PFU and the TCID50, the dose needed to initiate infection in 50% of individuals, is a topic of some controversy, with a review of the subject identifying estimates spanning several orders of magnitude, from 1.26 to 7 × 106.25 PFU47. Modelling studies have attempted to estimate directly the ratio between raw and effective viral loads, with a study of super spreading events concluding on the basis of a Wells-Riley model suggesting a ratio of between 2000:1 and 300:148. If we assume that the ability to form plaques under favourable experimental conditions is a necessary condition for a virus to cause infection in a host, and we allow for flexibility given the assumptions underlying modelling studies, our 3000:1 ratio is likely at the conservative end of the spectrum. Studies of viral load in other respiratory viruses show similar values to SARS-CoV-2. Data from infections with parainfluenza and respiratory syncytial virus show peak PFU levels around 106/ml49,50. Studies of influenza show mixed results, with peak PFU values often between 104 and 106/ml but with occasional cases potentially reaching 109 PFU/ml51. An interesting case is that of measles infection: The very high reported R0 for this virus52 makes this a potential case where transmission bottlenecks in unvaccinated individuals may be higher.

Uncertainty in the literature also exists around the precise distributions of particle sizes emitted via coughing, speaking and sneezing. While our method exploits experimental results, studies of these processes have historically used different methods and are not in perfect agreement. While we would not be confident about building a combined model of particle emission, combining, for example, speaking and sneezing, the finding that our basic result holds across such distinct models supports the robustness of our conclusions.

A final simplification in our model is the neglect of interactions between viruses, which could increase or decrease bottleneck sizes. Some interactions, such as those characterised by superinfection exclusion, are likely to reduce the number of cases of large transmission bottlenecks. In many cases of acute respiratory infection, a virus-founding infection leads to the rapid growth of viral particles53, such that after a given amount of time, any subsequent infection will involve the addition of a tiny fraction of the current within-host population. This, alongside the triggering of innate host immune responses54 and other cellular interactions55, limits the window of time available for new viruses to infect a host. Other interactions between viruses involving cooperation have the potential to increase the proportion of bottlenecks involving multiple virions56. Where single virions contain incomplete functional genomes, more than one may be required for a cell to produce a complete genome57,58. A consideration of viruses with incomplete genomes would require a more nuanced definition of what is meant by a transmission bottleneck.

Despite its limitations, the generalisability of our model and the reproducibility of our result across a broad range of scenarios provide what we believe is a compelling explanation for past observations of tight transmission bottlenecks in respiratory virus transmission. Where the number of cases of infection in a scenario is limited, as represented by a moderate value of Renv, most people exposed to an infected person are not themselves infected, incurring an effective transmission bottleneck of zero. Where infectious particles spread through the air via diffusion, it is difficult to generate patterns of exposure that combine cases of non-infection with cases of infection that exclusively involve large bottlenecks. The mechanism of airborne respiratory virus transmission leads to tight transmission bottlenecks.

Methods

Wells-Riley model

The Wells-Riley model adopts a Poisson assumption about infection. Suppose that a person receives a level of exposure E, by which we mean that the expected number of viruses causing an infection is equal to E. Then the probability of an individual being infected is given by

$$P\left({{\mbox{infection}}}\right)=1-{{\rm {e}}}^{-E}$$
(1)

While the derivation of E can be complex, involving multiple individual and environmental factors, we here simply considered a range of possible values for this statistic.

We denote the population bottleneck at transmission by N. Given a Poisson model, the proportion of cases of infection initiated by a single viral particle is given by

$$P\left(N=1\right)=\frac{E{{\rm {e}}}^{-E}}{1-{{\rm {e}}}^{-E}}$$
(2)

Similarly, the proportion of cases of infection initiated by ten or fewer viral particles is

$$P\left(N\le 10\right)=\frac{{\sum }_{k=1}^{10}\frac{{E}^{k}{{\rm {e}}}^{-E}}{k!}}{1-{{\rm {e}}}^{-E}}$$
(3)

These formulas were used to estimate transmission bottlenecks under different levels of exposure.

Transmission mediated by the airborne spread of infectious particles

In order to estimate how exposures to a virus might vary in a given environment, we built a model of the airborne spread of viruses within a room. Considering the first instance SARS-CoV-2 infection, we modelled the behaviour of particles emitted by an infected individual with a cough, measuring the subsequent exposure of others in the room.

Considering a single coughing event, we estimated for each environment the exposure E(x,r), describing the volume of emitted infectious material comprised of emitted particles of radius r to which an uninfected person at position x = {x,y} would be exposed over a period of time. This expression was calculated by a process of summation: We generated an expression for the time-dependent exposure EC(x,r,t), occurring t seconds after a single cough, and then summed over time and coughing events.

Diffusion model

To calculate EC(x,r,t), we estimated the concentration of infectious material contained in particles of radius r μm at x = {x,y} and time t following a single emission event. This concentration is altered by the emission of infectious particles into the environment, the spread of particles through space, the loss of particles via evacuation and sedimentation, and the inactivation of viruses contained within particles. We write

$$\begin{array}{c}\frac{\partial {{{{\rm{c}}}}}}{\partial {{{{\rm{t}}}}}}=\overbrace{I(x,t)}^{{{{{{\rm{Particle}}}}}}\,{{{{{\rm{emission}}}}}}}+\underbrace{K(L,Y)\left(\frac{{{\partial }^{2}}{{{{{\rm{c}}}}}}}{\partial {{{{{{{\rm{x}}}}}}}^{2}}}+\frac{{{\partial }^{2}}{{{{{\rm{c}}}}}}}{\partial {{{{{{{\rm{y}}}}}}}^{2}}}\right)}_{{{{{{\rm{Turbulent}}}}}}\,{{{{{\rm{diffusion}}}}}}}-\overbrace{{B}({r},{{{{{\rm{c}}}}}},{{{{{\rm{\gamma }}}}}})}^{{{{{{\rm{Evacuation}}}}}}}-\underbrace{S(r,c)}_{{{{{{\rm{Sedimentation}}}}}}}-\overbrace{D(c)}^{{{{{{\rm{Inactivation}}}}}}}\end{array}$$
(4)

We consider the parts of this equation in turn.

Emission of infectious particles

Coughing leads to the emission of a distribution of particle sizes: This distribution has been studied via a range of experimental means29,59. Following this literature, we modelled particles emitted from a cough as following a lognormal distribution30. We simulated particles with radii r {1, 2, …, 500} μm, with particles being emitted in quantities proportional to the function

$$f\left(r\right)=Q\left(2r,u,s\right)-Q\left(2\left(r-1\right),u,s\right)$$
(5)

where Q is the cumulative distribution function of the lognormal distribution

$$Q\left(d,\, u, \, s\right)=\frac{1}{2}\left[1+{{{{{\rm{erf}}}}}}\left(\frac{{{{{\mathrm{ln}}}}}\,d-u}{s\sqrt{2}}\right)\right]$$
(6)

with the parameters u = 2.60269 and s = 0.69314760. We assumed that the infected person coughed 10 times per hour at regular intervals61.

The velocity of particles following a cough falls rapidly, within a fraction of a second62. We, therefore, described a cough as instantaneously creating a cloud of particles at mean radial distance of 20 cm from the infected person (standard deviation 5 cm) and with a spread angle 45%63. By default, the volume of liquid emitted from a cough was set to equal 38 pl64. Altering the initial mean radial distance of the cloud of particles had only a small impact on exposure levels (Supplementary Fig. S6).

We also investigated models of particle emission by coughing and sneezing. Descriptions of these models, and further details of the emission model, are provided in Supplementary Information.

Evaporation

Emitted particles evaporate over time, the removal of liquid leaving behind a smaller solid particle with a radius of approximately one-quarter of that which was emitted65,66,67. This process occurs relatively quickly, with, for example, a droplet of size 20 μm evaporating in under a second67 and a droplet of size 55 μm evaporating within an estimated 14.5 s68. Given the overall timescale of our model, we assumed that the process of evaporation is short, such that a particle of radius r0 was, upon emission, instantaneously reduced to the new size r = r0/4. In the following description, we refer to particles according to their radius at the time of emission.

Turbulent diffusion

Once emitted, particles spread through the air via diffusion. Both Brownian motion and air turbulence potentially contribute to this, though at the size of particles we consider, it is likely that turbulent diffusion will dominate over Brownian motion69; our model therefore neglected the effects of Brownian motion.

The extent of turbulent diffusion depends upon how well a room is ventilated, with more frequent replacement of the air in a room, or a larger room, each requiring a higher mean rate of particle movement. We adopted a model based on the experimental measurement of air in a domestic environment70. This approach defined a characteristic length scale for a room by

$$L=\root 3 \of{{XYZ}}=\root 3 \of{V}$$
(7)

where V is the volume of the room in m3. Our model then links K, the turbulent diffusion coefficient, to L, and γ, the number of changes of the air in a room per hour:

$$\frac{K}{{L}^{2}}=0.52\gamma+0.31{h}^{-1}$$
(8)

Within our model, this becomes

$$K\left(L,\gamma \right)=\left(0.52\gamma+0.31\right){L}^{2}$$
(9)

Evacuation

We interpret the air change rate γ using the cutoff radius theory presented by Bazant et al.6. Under this model, the evacuation rate is the same as the air replacement rate for droplets below a cutoff radius given by

$${r}_{c}=\sqrt{\frac{9\gamma L{\mu }_{{\rm {a}}}}{2g\Delta \rho }}$$
(10)

where μa is the dynamic viscosity of air, g is the acceleration due to gravity, and Δρ is the difference in densities between water and air. Above the cutoff radius, the evacuation rate scales with 1/r2, with heavier particles being less subject to air movement. Where

$${r}_{*}=\max \left\{r,{r}_{{\rm {c}}}\right\}$$
(11)

we have

$$B\left(r,c\right)=-\gamma {\left(\frac{{r}_{{\rm {c}}}}{{r}_{*}}\right)}^{2}c$$
(12)

Sedimentation

Emitted particles will be affected by gravity, with heavier particles falling to the floor more quickly than lighter particles, according to Stokes’ Law. In our model we approximated this process by the simple removal of particles from the air over time. We followed a previous approach, which balanced diffusion and gravitational terms to approximate the time taken for a particle to fall to the ground68. We have

$${t}_{{{\rm {sed}}}}=\phi \frac{{z}_{0}}{{r}^{2}}$$
(13)

where z0 is the initial height of the particle, r is the particle radius, and ϕ is calculated as 0.85 ×10−8 ms. From this, we derived the sedimentation term

$$S\left(r,c\right)=\frac{c}{{t}_{{{\rm {sed}}}}}$$
(14)

The initial height of particles z0 was defined according to whether individuals in an environment were standing or seated. We made the assumption that the floor absorbs particles, with no rebound being possible.

Virus inactivation

Viruses within emitted particles are intrinsically unstable, such that the number of infectious particles in each droplet decays over time. An experimental study has suggested a half-life for SARS-CoV-2 of around 1.1 h33. The viral titre in each droplet is therefore given by

$$N\left(t\right)={N}_{0}{{\rm {e}}}^{-\lambda t}$$
(15)

where N0 is the initial number of particles in the droplet and λ is the decay constant. To model a half-life of 1.1 h, we set λ = 0.6301 h−1. We then have

$$D\left(c\right)=-\lambda c$$
(16)

Solution of the diffusion equation

By default, we made the assumption that, upon hitting a wall, particles are absorbed, either impacting upon the wall due to electrostatic or inertial forces71 or being caught in downward convection currents leading to their deposition on the floor41. By means of a sensitivity analysis, we also considered the case in which walls perfectly reflected particles. Under each of these conditions, the solution of Eq. (4) can be expressed analytically. Notes on the solution of the diffusion equation are provided in Supplementary Information.

Calculating individual exposures

We generated values c(x,r,t) at time intervals of one second for a period of one hour following each emission event. The initial values c(x,r,0) were scaled so that the total volume, summed across particle sizes, was equal to the desired volume of the emission. In order to calculate the total exposure of person i at the location xi = {xi, yi}, we generated values c(x,r,t) at positions in the square grid centred on xi, with dimension 40 cm, and containing points at resolution 2 cm, finding the mean value of c in this grid. The volume of the space represented by this box is 0.16Z, where Z is the height of the room so that we can calculate the density of viral particles of radius r in the box. Given a parameter A, describing the rate of air inhalation by a person, we calculated the expected number of particles of radius r inhaled within a  1 second interval at time t. Summing these values over times t, we obtained an expected number of particles of radius r inhaled in a 1 hour period following a single emission event. Summing these values over multiple emission events, we obtained an expected number of particles of radius r inhaled over the entire model period. We denote this number as Pi (r).

For each uninfected individual in our model, we generated a Poisson random variable

$${n}_{i,r}={{\mbox{Poisson}}}\left({P}_{i}\left(r\right)\right)$$
(17)

describing the number of particles of radius r inhaled by that person. This number was converted into a number of effective viruses: For each such particle, the expected number of effective viruses is given by

$${V}_{{\rm {e}}}\left({k}_{{\rm {b}}},r\right)=\left(\frac{4}{3}\right){k}_{{\rm {b}}}\pi {r}^{3}$$
(18)

where kb is the effective viral load of particles at the point of emission. To calculate the effective number of viruses inhaled via particles of radius r, we calculated a second Poisson random variable

$${v}_{i,r}={{\mbox{Poisson}}}\left({{n}_{i,r}V}_{{\rm {e}}}\left({k}_{{\rm {b}}},r\right)\right)$$
(19)

The transmission bottleneck related to the person i was finally calculated as the sum of these values:

$${N}_{i}={\sum }_{r}{v}_{i,r}$$
(20)

Person i was considered to have been infected if and only if Ni > 0. Statistics of bottlenecks were calculated across cases of infection.

For each scenario considered, we calculated 106 independent simulations, generating Ni for each individual in each simulation. Statistics were collated across simulations.

Calculation of k b

By default the effective viral load was calculated using an epidemiological model. Details are given in the Supplementary Information. A broad range of values of kb were considered.

Inhalation

Our model assumes that the process of being exposed does not change the local level of exposure i.e. breathing in viruses does not significantly remove viruses from the air. We explore this assumption further in Supplementary Information.

Environments

We modelled transmission within different environments, including an office, a bus, a nightclub, and a lounge. For each environment, our model was parameterised with the dimensions of the room in metres, X, Y, and Z, the number of uninfected people present, ne, and their locations, the air replacement rate γ, the length of time for which we assumed people were in the environment T, the volume of air breathed in per minute by an individual, A, and the height at which particles were emitted, z0. Parameters for each environment are shown in Table 1.

Table 1 Parameters defining each of the environments explored in our model

In Supplementary Information, we provide further notes on environmental parameters and a description of methods used to model variation in infectivity levels.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.