Introduction

Functional connectivity (FC) refers to patterns of statistical dependence in brain activity, such as the blood oxygen level-dependent (BOLD) signal measured via functional magnetic resonance imaging (fMRI). Static FC is traditionally calculated over the course of an entire scan session and it is an established technique of modern neuroimaging1,2, with individual differences linked to brain disorders and cognitive states3,4. On the other hand, time-varying FC refers to time-resolved fluctuations in FC, typically estimated by fitting dynamic models5,6,7,8 or by sliding windows9,10. Differences in time-varying FC at rest are also associated with a wide range of cognitive and behavioural traits, as well as psychiatric and neurological conditions (see ref. 11 for a recent review of this rapidly growing field).

To avoid the temporal blurring caused by sliding windows9, it is possible to analyse BOLD fluctuations at the resolution of single frames. Coactivation patterns (CAPs)12 and point-process analyses13 are early examples of this approach. Initially using a single seed region, both found that static seed-based FC maps can be reliably approximated by averaging only a few high-amplitude frames. Whole-brain extensions of both methods have soon followed14,15. More recently, edge-centric approaches have generated excitement as they also go beyond single seeds and additionally decompose the entire FC matrix into its frame-wise contributions by omitting traditional time averaging over the length of the experiment16,17. Such temporal unwrapping of the FC results in a large number of edge time series, each capturing the moment-to-moment cofluctuations of a pair of brain regions. Going one step further, one can measure the similarity between every pair of edges to obtain a large matrix, referred to as edge FC (eFC), as opposed to the traditional node-centric FC (nFC). The eFC is shown to be replicable, stable within individuals across multiple scanning sessions and reliable across datasets17. Furthermore, clustering the eFC yields overlapping brain communities that could better suit the study of aspects of cognition and behaviour that transcend traditional disjoint brain parcellations. While CAPs focus on patterns of BOLD activity, edge-centric approaches focus on patterns of cofluctuations between all regional pairs. However, the analogous finding is that the static nFC can be faithfully approximated by averaging a few instantaneous connectivity patterns. These are characterised by simultaneous large cofluctuations across all node pairs, measured as the root-sum-of-squares (RSS) over all the edge time series values corresponding to the same frame. Such brief, intermittent, and high-amplitude cofluctuations drive the nFC and the network structure over these time points contributes disproportionately to the overall modularity of the functional brain network16.

Although the edge-centric decomposition of nFC into its frame-wise constituents is mathematically exact, a comprehensive treatment of the statistical properties of the edge-centric measures is lacking and there is a consensus on the need for appropriate null models17,18. A rigorous mathematical study is especially important since several widely-acknowledged publications have warned about the dangers of extracting structure from noise when studying static or time-varying FC, often using minimal null models to reproduce existing results19,20,21,22,23,24. The warnings concerning sampling variability are particularly relevant to edge-centric methods as they represent an extreme case of a single-frame sliding-window approach. High-amplitude cofluctuations can be observed in temporally-uncorrelated synthetic time series such that accounting only for static spatial correlations is sufficient to replicate key empirical findings16(Fig. S4). This observation has been interpreted as further evidence that large cofluctuations are not fMRI artefacts. However, it arguably raises an equally pressing conceptual concern: what information do current edge-centric measures provide beyond the nFC, if any?

Here, we tackle this question mathematically and present a theoretical explanation for the widespread occurrence of large cofluctuations across datasets and why a few large events drive the nFC. This explanation rests on fundamental properties of subexponential distributions25. Further mathematical derivations clarify how the nFC eigenvalues shape the RSS distribution and how the leading nFC eigenvectors underpin the spatial correlation patterns expressed during high-amplitude events. The influence of functional modules on the eigenvalue distribution could explain why these events disappear when the modular structure is disrupted, as recently reported in ref. 26. Finally, we analytically show that the eFC matrix, the edge communities, the large cofluctuations, and the corresponding brain activity modes can all be predicted from the nFC without recourse to the edge-centric formulation. Many of these derivations are based on the null hypothesis of i.i.d. Gaussian variables that only takes into account the observed (static) spatial correlations and ignores temporal features. Under this assumption, and invoking results from random matrix theory27, the edge time series variability is described by the sampling distribution of the nFC, known as the Wishart distribution28. Testing the analytic predictions using fMRI data from the Human Connectome Project (HCP)29 shows that the null model is sufficient to replicate the vast majority of existing edge-centric features both qualitatively and quantitatively, as well as foundational properties of CAPs.

Results

We present six main results showing that the existing findings based on edge time series16,17 can be derived from the static nFC under the null hypothesis of i.i.d. multivariate Gaussian variables that preserve the observed static spatial correlations but not the temporal ones. The theoretical predictions were empirically tested using a HCP dataset comprising 100 subjects, preprocessed with the current standard HCP pipeline, both with and without global signal regression (GSR). The detailed derivations of the presented equations are in the ‘Methods’ section, reserving the current section for a concise account of the key results.

The edge FC matrix can be derived analytically from the node FC

The eFC was introduced in ref. 17 to quantify linear interactions between edges. For each pair of brain regions, an edge time series is computed as the element-wise product of the two regional z-scored time series. Thus, the values of each edge time series represent the instantaneous cofluctuation magnitudes between the corresponding pair of brain regions. The eFC is the edge-by-edge matrix obtained by computing the inner products between all pairs of edge time series, normalised to the [−1, 1] interval.

Our first result is that the eFC can be analytically derived from the nFC under the static Gaussian null hypothesis (Fig. 1a, b). As shown in Eq. (11) of the ‘Methods’ section, the (jk, lm) entry of the eFC matrix is obtained as a sum of pairwise products between the (j, k, l, m) entries of the nFC matrix, divided by a normalisation factor:

$${{{{{{{{\rm{eFC}}}}}}}}}_{jk,lm}=\frac{{r}_{jk}{r}_{lm}+{r}_{jl}{r}_{km}+{r}_{jm}{r}_{kl}}{\sqrt{1+2{{r}_{jk}}^{2}}\sqrt{1+2{{r}_{lm}}^{2}}}.$$
(1)

Using the HCP dataset, the predicted eFC achieves an average Pearson correlation of r = 0.93 with the empirical eFC (r = 0.88 if GSR is applied). The distributions across 100 unrelated HCP subjects are shown in Fig. 1c. This is a significant improvement on the linear regression approach adopted in ref. 17, which achieved an average Pearson correlation of r = 0.72 on pairs of edges not sharing any nodes, but performed poorly otherwise (r = 0.06). Moreover, by revealing the mathematical relationship between eFC and nFC, Eq. (1) explains why the eFC is highly replicable, stable within individuals across multiple scan sessions and consistent across datasets—it will be as long as the nFC is.

Fig. 1: The edge functional connectivity (eFC) can be derived analytically from the node functional connectivity (nFC) under the static Gaussian null model.
figure 1

a The eFC of the first subject from the Human Connectome Project (HCP) fMRI dataset is plotted as an example. This matrix scales with the fourth power of the parcellation size: in the current case of 200 parcels, it has ~400 million elements. The (jk, lm) entry of the eFC matrix (black square at the intersection of the black lines) is computed as the inner product of two edge time series jk and lm, normalised to the [−1, 1] interval. b Under the static null model, the eFC of each subject can be derived analytically as a sum of pairwise products between the (j, k, l, m) entries of the corresponding nFC matrix as per Eq. (1). Since the nFC is symmetric, only six entries are necessary to represent all pairwise combinations (marked by each of the black squares at the intersections of the black lines). c Distribution of Pearson correlation coefficients between the empirical and null eFC computed over 100 HCP subjects. Results are reported with and without global signal regression (GSR) and the mean correlation values are indicated by the dashed lines.

The edge communities can be predicted from the nFC

Clustering the eFC via the k-means algorithm was used to identify communities of co-fluctuating edges. These were then mapped back to individual nodes to obtain overlapping regional communities, offering a new way to study aspects of cognition and behaviour that transcend traditional disjoint brain parcellations17. Given the high similarity between the empirical and null eFC demonstrated in the previous section, it would be reasonable to expect that the edge communities be well recovered by the null model by applying the same community detection algorithm to the predicted eFC. Indeed, the agreement between the empirical and predicted community assignments is exact on 84% of the 19,900 edges, and 74% if GSR is applied (Fig. 2a, b). At the node level, the similarity is even stronger. In ref. 17, the similarity between two nodes, referred to as “edge cluster similarity”, is measured as the fraction of all edge pairs starting from those two nodes (and reaching the same target) that are clustered together. Using the same procedure, the predicted edge cluster similarity achieves a Pearson correlation coefficient of r = 0.96 with the empirical one, and r = 0.95 if GSR is applied (Fig. 2c, d). Note that the rows and columns of the matrices in Fig. 2 have been rearranged to match the ordering of the 16 networks used in ref. 17 (Fig. 6) to facilitate a visual comparison. Given that the analysis is performed on the same HCP dataset, the small differences in the empirical results are most likely due to the different preprocessing pipelines (here we use the preprocessed data made available by the HCP; see the ‘Methods’ section for details). This adds to the evidence that edge-centric measures are not strongly dependent on the preprocessing approach, and similarly speaks to the robustness of the null model predictions.

Fig. 2: The edge communities can be predicted from the nFC under the static Gaussian null model.
figure 2

a Empirical edge communities obtained by k-means clustering of the eFC computed on the HCP dataset. The edge labels are reshaped into an NxN matrix where each (i, j) entry represents the community label of the edge linking i to j. For each edge, the representative community assignment shown here is the statistical mode (i.e., the most common label) across 100 HCP subjects. b Predicted edge communities obtained by k-means clustering of the eFC predicted by the static null model. The agreement with the empirical labels in panel a is exact on 84% of all 19900 edges (74% if GSR is applied). c Empirical edge cluster similarity on the HCP dataset. The similarity of edge communities involving nodes i and j is computed as the fraction of matching elements between the corresponding two rows of the edge community matrix in panel a. d Predicted edge cluster similarity based on the static null model. The Pearson correlation with the empirical edge cluster similarity in panel c is r = 0.96 (r = 0.95 if GSR is applied).

So far, the results in this subsection have been based on simulations. Let us now see how a mathematical derivation can provide further insight into these edge and node similarities. First, recall that the edge communities are obtained by clustering the eFC (e.g. via k-means). The main obstacle to a full analytic approach is that the outcome of stochastic clustering algorithms cannot be entirely predicted from their input; however, since they are usually based on a distance metric, it is reasonable to expect that the smaller the distance between two rows of the eFC, the higher the probability that the corresponding edges would be clustered together. Accordingly, we define the distance between two edges (jk) and \((j^{\prime} k^{\prime} )\) as the 1 norm of the difference between the corresponding rows of the eFC. As shown in Eq. (12) of the ‘Methods’ section, this edge distance simplifies to

$$d_{jk,j^{\prime} k^{\prime}} =\sum\limits_{l,m=1}^N 3 \left\vert{{{{{\mathbf{z}}}}}}_j ({{{{{\mathbf{z}}}}}}^{\top}_l {{{{{\mathbf{z}}}}}}_m) {{{{{\mathbf{z}}}}}}^{\top}_k - {{{{{\mathbf{z}}}}}}_{j^{\prime}} ({{{{{\mathbf{z}}}}}}^{\top}_l {{{{{\mathbf{z}}}}}}_m) {{{{{\mathbf{z}}}}}}^{\top}_{k^{\prime}} \right\vert,$$
(2)

where, for a brain region i, the row vector zi is its z-scored BOLD signal and \({{{{{{{{\bf{z}}}}}}}}}_{i}^{\top }\) denotes its transpose. What does this imply for the node communities? The similarity between two nodes i and j was measured in ref. 17 as the fraction of all edge pairs starting from i and j (and reaching the same target) that are assigned to the same cluster. Here, the analogous analytic step is to compute the distance between nodes i and j as the sum of the distances between the edges starting from them. Crucially, the resulting node distance can be expressed in terms of BOLD signal correlations (see Eq. (13) in the ‘Methods’ section):

$${d}_{i,j}\le c\, {\left(1-{r}_{ij}\right)}^{\frac{1}{2}},$$
(3)

where c denotes a constant term, independent of i and j. This implies that the edge-cluster similarity17 between nodes i and j can be predicted from the corresponding rij entry of the nFC, avoiding the memory-intensive computation of the eFC and computationally-intensive clustering algorithms (the space complexity of the eFC is \({{{{{{{\mathcal{O}}}}}}}}({N}^{4})\), i.e., it scales with the fourth power of the number of regions and requires over a terabyte of memory for fine brain parcellations—for each subject). Using the HCP dataset to test this prediction shows that the nFC alone achieves an average Pearson correlation of r = 0.76 with the empirical edge cluster similarity matrix shown in Fig. 2c. Once again, the static, node-centric, second-order features of the BOLD signal are sufficient to replicate key findings that appear at first to rely on the specific temporal sequence of BOLD cofluctuations at the single-frame resolution.

The null model reproduces the high similarity of the top RSS frames to the nFC

Let us now consider the root-sum-of-squares (RSS) of the edge time series introduced in ref. 16. The RSS is a univariate time series defined as the Euclidean norm of the edge time series vector at each frame. In other words, the RSS peaks when all the cofluctuations (i.e., the edge time series) are simultaneously high in absolute value, either positive or negative. The key finding in ref. 16 is that only a small fraction of frames exhibiting large RSS values are required to explain a significant fraction of variance in the nFC, as well as the network’s modular structure30. Both results are perfectly reproduced by the static null model, as shown in Fig. 3a, b and Supplementary Fig. 1. What is particularly remarkable is that the timing of the high-amplitude RSS events produced by the null model are arbitrary, and yet a small fraction of frames corresponding to these large cofluctuations is still sufficient to explain the observed nFC. Furthermore, Fig. 3c shows that the null model frames with the largest RSS also exhibit high similarity to the empirical frames with the largest RSS from the HCP dataset—occurring at entirely different times. Note that, unlike temporal dependencies, spatial correlations are necessary to reproduce the results: if the null model is chosen to be both temporally and spatially uncorrelated, high-RSS frames are no more similar to the empirical nFC than low-RSS frames (see Supplementary Fig. 2). Another interesting note is that high-RSS frames exhibit the strongest average correlation with all other BOLD frames and this average similarity decreases with the RSS magnitude, both in empirical and null cases (see Supplementary Fig. 3).

Fig. 3: The static Gaussian null model can reproduce the strong correlation between high-amplitude cofluctuation patterns and the nFC.
figure 3

a Only a small fraction of frames exhibiting the largest cofluctuations root-sum-of-squares (RSS) are required to explain most of the nFC variance. The similarity is computed as the Pearson correlation coefficient between the nFC and the average FC estimated from the top and bottom 5% of the total frames. Each point corresponds to one of 100 unrelated subjects from the Human Connectome Project (HCP) dataset, with boxes indicating the quartiles and whiskers length specified as 1.5 times the interquartile range. b The same results hold more generally when the frames are ordered according to the corresponding RSS amplitude, either in descending or ascending order. Here, the curves represent the average similarity over 100 subjects. c The findings do not depend on the timing of the high-amplitude RSS events: the top frames generated by the null model exhibit high similarity to the top frames of the real HCP data, which occur at different times.

A theoretical explanation for these findings will be provided in the ‘nFC eigenvectors underpin spatial patterns of high BOLD activity’ section and supported by detailed derivations in the ‘Methods’ section. Interestingly, most of these points were also reported in ref. 16(Fig. S4), where they were taken as evidence that large RSS events are not fMRI artefacts. While settling that methodological issue, these observations raise a conceptual concern: if matching the timing of the RSS events is not essential and the results can be replicated by the static null model, does the edge-centric approach provide any information about the time-varying connectivity that cannot be explained by the static nFC? We will address this question in the next section, by examining the statistical properties of the RSS.

The RSS distribution is determined by the nFC eigenvalues

Having established that the large RSS events are not an exclusive feature of neural signals, let us investigate how their ubiquitous appearance across datasets can be analytically explained and why the corresponding frames account for the largest fraction of variance in the nFC. As a first step, the RSS can be computed as the squared Euclidean norm of the z-scored BOLD signal z (full derivations provided in the ‘Methods’ section, see Eqs. (14) to (16):

$${{{{{{{{\rm{RSS}}}}}}}}}_{{{{{{{{\rm{all}}}}}}}}}(t)={\left\Vert {{{{{{{\bf{z}}}}}}}}(t)\right\Vert }^{2}.$$
(4)

The intuition behind this equivalence is that summing over the pairwise products of the elements of a vector is the same operation that is performed when squaring a polynomial; in this case, the vector is the squared BOLD signal, its pairwise products are the squared edge time series, and the polynomial is the squared Euclidean norm. The key message is that, although it was introduced in ref. 16 as the Euclidean norm of the edge time series, the RSS is mathematically equivalent to the squared Euclidean norm of the BOLD signal, i.e., a measure of the overall BOLD signal amplitude at each time step, akin to the variance. We can then proceed without resorting to the edge time series, which is not only convenient in practice but also shifts the conceptual focus back to the BOLD time series—which are more readily interpretable. For a large family of common (sub-Gaussian) distributions, the squared Euclidean norm of a random variable (RV) is heavy-tailed (more specifically, it is subexponential31). The RSS being a squared norm, large cofluctuations are then to be expected due to its heavy-tailed distribution (see Eq. (28)), offering an explanation for the large RSS peaks observed in the BOLD time series.

In the specific case of Gaussian variables (i.e., the null hypothesis), the RSS can be expressed as a sum of N independent Gamma (\(k=1/2,\theta =\sqrt{2}{\lambda }_{i}\)) variables, each related to an eigenvalue of the nFC matrix (λi, i = 1…, N). This is summarised by the moment-generating function in Eq. (27). The largest eigenvalues capture the distribution tail and including smaller eigenvalues provides an increasingly complete characterisation of the empirical RSS distribution (Fig. 4b). This distribution can be used for testing the statistical significance of the empirical RSS observed in the HCP dataset against the static null hypothesis of spatially correlated noise. Figure 4a illustrates the convergence of the empirical RSS distribution to the null distribution as more and more time frames are observed (i.e., over longer fMRI sessions). When all the 1200 time frames available in the HCP data are utilised, the null hypothesis cannot be rejected for 58% of the participants at a 5% significance level and on 90% of the participants after Bonferroni correction for multiple comparisons (p-values given by the two-sided Kolmogorov–Smirnov test).

Fig. 4: Properties and convergence of the RSS distribution derived analytically under the static null hypothesis.
figure 4

a The empirical RSS distribution increasingly approximates the null RSS distribution as more realisations (frames) are observed. On the horizontal axis, the cumulative distribution function (CDF) of the null RSS is evaluated for each empirical RSS observation; the CDF of the resulting distribution is plotted on the vertical axis. Identical distributions would produce points on the diagonal (dotted line). b The nFC eigenvalues shape the null RSS distribution. The null RSS distribution increasingly approximates the empirical RSS distribution as more eigenvalues are considered, with the largest eigenvalues explaining most of the variance. On the horizontal axis, the cumulative distribution function (CDF) of the empirical RSS is evaluated for each null RSS observation; the CDF of the resulting distribution is then plotted on the vertical axis. Identical distributions would produce points on the diagonal (dotted line).

nFC eigenvectors underpin spatial patterns of high BOLD activity

High-amplitude BOLD cofluctuations were observed to be underpinned by a particular spatial mode of brain activity in which default mode and control networks are anticorrelated with sensorimotor and attentional systems16. This particular spatial mode was defined as the first principal component of the BOLD activity and, as such, it can be obtained as the largest eigenvector of the static nFC matrix by mathematical equivalence and without recourse to null models (Fig. 5a).

Fig. 5: The static Gaussian null model can explain the spatial patterns of BOLD activity and the cofluctuation patterns that characterise high-amplitude frames.
figure 5

a The spatial mode underpinning high-amplitude cofluctuations is captured by the leading eigenvector of the nFC matrix. The 200 brain regions were partitioned into the same 16 networks used in ref. 16 (Fig. 2e) to facilitate a visual comparison. In addition to a scatter plot of the data, the violin plots show the probability densities and the box plots indicate the quartiles, with the maximum whisker length specified as 1.5 times the interquartile range. The sample sizes of the box plots are the sizes of the 16 networks, that is, n = [16, 15, 6, 14, 17, 6, 12, 10, 14, 16, 10, 19, 15, 6, 12, 12]. b Frames corresponding to large RSS values exhibit high similarity to the leading nFC eigenvectors. The similarity is measured as the Pearson correlation between the instantaneous FC estimate from a single frame and each of the estimates from the four leading nFC eigenvectors. c Higher principal component (PC1) coefficients are associated with large RSS events. In addition to a scatter plot of the data, the box plots indicate the quartiles and the whisker length is specified as 1.5 times the interquartile range (sample size n = 6000). Compare this figure with the results published in ref. 16 (Fig. 2c).

The only question left to answer is whether the RSS is predicted to peak when the BOLD activity aligns with the largest eigenvector. This can be proven to be true even beyond the static null hypothesis (see Eqs. (23) to (24) in the ‘Methods’ section). An intuitive understanding can be gained once the RSS is seen as the fluctuation of the BOLD signal amplitude over time: high-amplitude frames have a larger variance, which is captured by a larger coefficient of the first principal component since the latter is the vector that aligns with the direction of maximum variance (Fig. 5c). We can then refine our theoretical understanding of the RSS peaks: not only do they occur when the Euclidean norm of the BOLD signal vector is large but, most likely, when it is well aligned with the leading eigenvector of the nFC. However, the alignment with the leading eigenvector need not be perfect: in general, large RSS values can be expected whenever the BOLD vector is a mixture of the top eigenvectors (Fig. 5b and Supplementary Fig. 4). Additional principal components are included as more large-RSS frames are averaged, suggesting why the top 5% frames alone are sufficient for an almost perfect reconstruction of the nFC (Fig. 3). We have thus explained why the nFC estimates corresponding to frames with the largest RSS exhibit the highest similarity with the nFC. Moreover, since the nFC features multiple communities, high-RSS frames naturally reflect this property by exhibiting high modularity, as well as higher values than low-RSS frames, which are less similar to the nFC (see Supplementary Fig. 1).

One could speculate that the cofluctuation patterns corresponding to large nFC eigenvectors would be closely related to empirical cofluctuation patterns obtained by clustering high-RSS frames as in ref. 18. For example, the leading nFC eigenvector has the highest similarity to all BOLD frames and the corresponding cofluctuation pattern would resemble the most frequently occurring cluster.

A null model for binary edge time series

It was recently observed that thresholding the edge time series retains most of the information about the nFC since averaging the binarised edge time series over time still yields a very accurate approximation of the nFC both at the voxel15 and parcel level32. As a premise, the latter finding is empirically replicated here using the HCP dataset, with a resulting Pearson correlation r = 0.98 between the average binarised edge time series and the nFC (average correlation over 100 unrelated subjects). How to explain this almost perfect correlation? Furthermore, it was noted that the binary edge time series are highly constrained by the nFC32. What is the nature of such constraints? Both questions can be answered mathematically under the static Gaussian null hypothesis. Consider two brain regions or parcels j and k with z-scored BOLD activity Zj(t) and Zk(t) and their corresponding edge time series Cjk(t) = Zj(t)Zk(t). The thresholded edge time series \(\overline{{C}_{jk}(t)}\) is equal to 1 when the cofluctuations between j and k are positive, i.e., when Zj(t) and Zk(t) have the same sign. Under the null hypothesis, Zj(t) and Zk(t) are normal Gaussian RVs with correlation coefficient denoted as rjk. If the two parcels are uncorrelated (rjk = 0), their sign is positive or negative with equal probability and can be described mathematically as a Bernoulli(1/2) RV, which is a formal way to model a fair coin draw. The corresponding binary edge time series \(\overline{{C}_{jk}(t)}\) can also be modelled as a fair coin draw since the two signs are expected to agree half of the times. The situation clearly gets more complicated in the case of 200 correlated parcels as in the HCP dataset considered here. Intuitively, we would expect \(\overline{{C}_{jk}(t)}=1\) if the two parcels were perfectly correlated and \(\overline{{C}_{jk}(t)}=0\) if they were perfectly anticorrelated, i.e., in consistent disagreement. This intuition can be formalised mathematically and extended to all intermediate cases as shown in Eq. (30) of the ‘Methods’ section. A trigonometric argument proves that \(\overline{{C}_{jk}(t)}\) can be modelled as a biased coin, i.e., a Bernoulli(pjk) RV with success probability

$${p}_{jk}=\frac{1}{2}+\frac{\arcsin ({r}_{jk})}{\pi }.$$
(5)

This result reveals the exact analytic relationship between the binary edge time series and the nFC, answering the second question. However, it is straightforward to show that Eq. (5) also predicts the approximated nFC obtained by averaging the binary edge time series. This follows from the fact that the time average of ergodic processes (including the null model) converges to their expectation, and that the expectation of a Bernoulli RV is equal to its success probability, pjk. Testing this analytic prediction on the HCP dataset confirms its accuracy (see Supplementary Fig. 5b). Finally, we can explain the strong correlation between the original and approximated nFC by noting that pjk in Eq. (5) is very close to rjk for small values of the correlation (see Supplementary Fig. 5a), which are the most frequent ones.

Relationship with coactivation patterns (CAPs)

CAPs12,33 preceded edge-centric approaches in the study of spatial patterns of BOLD activity at the single-frame resolution. These patterns differ from established resting-state functional networks and it was speculated that they may originate from a neuronal avalanching phenomenon. Both CAPs and edge-centric studies report a strong similarity between high-amplitude frames and the nFC, and both employ k-means to assign these frames to different clusters12,18. However, there are important differences between the two methods. CAPs are patterns of node coactivations (i.e., they are defined in node space), while the frame-wise FC patterns studied in edge-centric analyses are defined in edge space. Moreover, CAPs are seed-based, i.e., frames to be clustered are selected based on high activity levels of a single node rather than simultaneous high activity across all nodes. For example, choosing the posterior cingulate cortex (PCC) as the seed, Liu and Duyn12 observed a strong correlation between the BOLD activity at times where PCC is highly active and the corresponding seed-based FC map.

The static Gaussian null model employed in this study can predict this strong correlation, which is the fundamental conceptual and practical property guiding the selection of frames that are subsequently clustered into CAPs. As shown in Eq. (32) of the ‘Methods’ section, frames selected based on the high activity of a seed node are expected to exhibit BOLD activity patterns that reflect the corresponding column of the nFC. Furthermore, the correlation is directly proportional to the activity level and thus peaks at frames with the highest activity of the seed. This relationship is illustrated in (Fig. 6a), where frames are sorted in descending order based on the seed node activity. It is important to note that parcels are used instead of voxels, in alignment with the rest of the simulations herein; however, the mathematical results hold true in both cases. What happens when more frames are averaged? The answer is shown in Fig. 6b and follows intuitively from Fig. 6a. Let us again consider the frames sorted in descending order; as the activity threshold is lowered (i.e., set to a lower percentile), more frames are averaged. The first frames are best aligned with the seed FC vector (i.e., the nFC column corresponding to the seed node) and their average quickly converges to the same direction. The alignment (or similarity) reaches a plateau as the middle frames are added to the average since they have low or zero correlation with the seed FC vector. Finally, as the last and most negatively-correlated frames are added to the average, the similarity drops sharply. This explains and replicates the first findings of the seminal CAPs paper by Liu and Duyn12.

Fig. 6: The static Gaussian null model can explain the spatial similarity between high-amplitude frames based on a seed node and the corresponding correlation map—a core feature of coactivation patterns (CAPs).
figure 6

a For each seed node (i.e. parcel), the frames are sorted in descending order based on the BOLD activity of the seed (horizontal axis). The Pearson correlation is computed between each frame and the seed-based correlation map (i.e., the column of the nFC matrix corresponding to the seed). The curves represent the average similarity over all 200 seed nodes and over 100 HCP subjects. As predicted analytically, the similarity is proportional to the BOLD activity and it decreases as lower-amplitude frames are considered (lower percentiles). The null model is in excellent agreement with the empirical results. b Same as in panel (a), but the BOLD activity of all the frames above a given threshold (percentile) are averaged before computing the correlation with the seed-based FC map. Only a small fraction of high-amplitude frames is required to explain most of the nFC variance, reproducing a well-known result in the CAPs literature12.

To summarise, analytic predictions in the case of CAPs differ from the edge-centric case in that the high-amplitude frames are best explained by individual nFC columns rather than its eigenvectors. However, there is an intuitive link between the two: if we select frames where a specific node is highly active (CAPs approach), the BOLD signal will most likely resemble the corresponding nFC column; if we select frames where all nodes are highly active (edge-centric approach), the BOLD signal will most likely resemble all the nFC columns, i.e., it will align with the leading nFC eigenvector.

Discussion

We have presented mathematical proofs and numerical analyses of real HCP data supporting our claim that the static nFC is sufficient to replicate the main resting-state edge-centric findings in refs. 16,17 both qualitatively and quantitatively, without relying on the edge time series nor any temporal correlations. Specifically, the eFC, the edge communities, the edge time series norm (RSS) distribution, and the spatial BOLD patterns underpinning large cofluctuations can all be predicted from the nFC under the null hypothesis of i.i.d. multivariate Gaussian variables. As further shown, key properties of binary edge time series32 and CAPs12 can similarly be predicted. The inability to reject the null hypothesis on most of the HCP 100 unrelated subjects does not support the conclusion that these edge-centric metrics provide additional information beyond the nFC. These results are not an attempt to disprove the existence of finely timed neural events—they just warn that the evidence provided by fMRI data may not be sufficient to reject simpler explanations, based on the established nFC literature, for the edge-centric features studied in refs. 16,17. In fact, previous influential studies have raised similar warnings in the context of sliding-window approaches to time-varying FC19,21,22.

However, it would be premature to conclude that the nascent edge-centric approach has no merit, and we acknowledge the fast progress in its development and applications at the time of writing18,26,32. Particularly interesting is the influence of structural modules on the edge cofluctuations26, which we briefly address in the ‘Methods’ section. The size of the functional modules shapes the spectrum of the nFC: larger functional modules allow for larger eigenvalues, which underpin the high-amplitude cofluctuations. This offers a mathematical insight into the relationship between modular structure and large cofluctuations, and why the latter disappear if the modular structure is disrupted. While fully addressing the latest (and increasingly large number of) preprints is beyond the scope of this work, it is possible that model-based approaches will reveal the role of edge-centric properties in bridging brain structure and function. Indeed, temporally-unfolded (or point-wise) dependence measures have been instrumental in studying the structure-function relationship in canonical complex systems34,35; seeing the edge time series as point-wise mutual information under the Gaussian assumption could create new links to the existing literature.

It would also be unreasonable to assume that the null hypothesis of i.i.d. variables is a good description of the BOLD signal, which is slowly-varying and highly autocorrelated. Thus, the fact that such null model is able to replicate the edge-centric features in refs. 16,17 is an indication that the temporal structure of the edge time series has not been fully exploited. Indeed, besides the notable cases of the synchronisation of the cofluctuations across subjects watching the same movies and the peak-to-peak interval distribution (see Fig. 6), most of the proposed edge-centric metrics are unaffected by time shuffling (i.e., they are static measures) and can thus be replicated by the i.i.d. null model. Our mathematical derivations do not directly apply to sliding-window approaches since windowed correlations are changed by time shuffling, but it is important to remember that many common time-varying FC analysis pipelines have intermediate steps that alternately leverage and neglect temporal ordering11. For example, one might estimate sliding-window correlations (dynamic stage), apply k-means clustering to the resulting time-resolved FC matrices (static stage, since k-means ignores the temporal ordering of the windows), and then evaluate state properties such as dwell times and transition probabilities (dynamic stage). While dynamic measures cannot be predicted from the static nFC, it is not unlikely that static null models could reproduce some of the static measures involved, e.g. the state patterns found via k-means, but not their transition probabilities.

Let us note, however, that the role of null models in time-varying FC is a matter of current debate19,36, and not all features that can be explained by null models are clinically irrelevant or to be dismissed. For example, using a small fraction of high-amplitude frames to approximate the nFC has been suggested as a way to compress the BOLD signal and alleviate the computational burden of analysing large fMRI datasets without compromising the prediction accuracy15. Future studies may find other useful criteria for filtering the frames using the edge time series. As a final contribution, we have analytically shown that the predominance of a few frames in shaping the nFC is to be expected: the nFC captures the BOLD signal variance, which is a second-order statistic and heavy-tailed (even if the BOLD signal were Gaussian). Therefore, the nFC is necessarily shaped by a few tail events corresponding to large amplitude frames. These frames determine the direction of maximum variance and thus the first principal components of the BOLD signal, forming the leading eigenvectors of the nFC.

In conclusion, we have laid out the mathematical foundations for the edge-centric FC analysis with the goal of informing future studies, in an interplay with empirical observations and simulations. Future work could leverage this mathematical framework and focus on dynamic measures that cannot be easily explained by minimal null models like the one presented herein.

Methods

Definition of edge-centric FC

Functional connectivity is defined as the magnitude of the statistical dependence between pairs of brain parcels37. This dependence is typically estimated from their time series (here, the BOLD signal) using the Pearson correlation coefficient. Let N be the number of parcels, T be the number of recorded frames, and xi = [xi(1), …, xi(T)] be the time series recorded from parcel i, with 1 ≤ i ≤ N. The correlation between two parcels i and j can be computed as \({r}_{ij}=\frac{1}{T-1}{\sum }_{t}{z}_{i}(t){z}_{j}(t)\), where zi and zj are their z-scored time series row vectors, i.e., \({{{{{{{{\bf{z}}}}}}}}}_{i}=\frac{{{{{{{{{\bf{x}}}}}}}}}_{i}-{\mu }_{i}}{{\sigma }_{i}}\) (with μi and σi indicating the time-averaged mean and standard deviation). Repeating this procedure for all pairs of parcels results in a node-by-node (N × N) correlation matrix R = [rij], which is an estimate of the (node-centric) functional connectivity.

The edge time series between two parcels i and j is the vector resulting from the element-wise product of zi and zj, which encodes the magnitude of their cofluctuations over time:

$${c}_{ij}(t):={z}_{i}(t){z}_{j}(t).$$
(6)

We will denote the random variable (RV) associated with the edge time series cij(t) with a capital letter, i.e., Cij(t). The column vector of all the N2 edge time series values at a given time t can be reshaped into a N × N matrix that is an instantaneous estimate of the dynamic functional connectivity based on a single frame. This matrix is symmetric since each edge time series is independent of the order of the node pair it involves, i.e., cij(t) = cji(t), i, j. For this reason, only the upper-triangular portion is typically computed, for a total of N(N − 1)/2 edge time series instead of N2.

It is also possible to go one step further and estimate the statistical dependence between each pair of edge time series, where each edge corresponds to a pair of parcels. This fourth-order statistic results in a large \(\frac{N(N-1)}{2}\times \frac{N(N-1)}{2}\) matrix named edge functional connectivity (eFC)17, with normalised entries defined as

$${{{{{{{{\rm{eFC}}}}}}}}}_{jk,lm}:=\frac{{\sum }_{t}{c}_{jk}(t)\,{c}_{lm}(t)}{\sqrt{{\sum }_{t}{c}_{jk}{(t)}^{2}}\sqrt{{\sum }_{t}{c}_{lm}{(t)}^{2}}}.$$
(7)

Null hypothesis

Let z be the N × T (parcels × frames) matrix of z-scored BOLD observations. Since our goal is to derive the (dynamic) edge-centric properties from the (static) nFC matrix (R = [rij]), we need to define a null hypothesis that discounts any temporal dependencies but retains the observed spatial correlations in R. A simple null hypothesis on the distribution of (the columns of) z that satisfies this criterion is \({{{{{{{\bf{Z}}}}}}}}(t) \sim {{{{{{{\mathcal{N}}}}}}}}(0,{{{{{{{\bf{R}}}}}}}})\), that is, i.i.d. multivariate Gaussian RV with a covariance matrix R matching the observed nFC. If we denote the state of the system at time t as the column vector \({{{{{{{\bf{z}}}}}}}}(t)={[{z}_{1}(t),\ldots ,{z}_{N}(t)]}^{\top }\), the null hypothesis simply states that z(t) is drawn from the same multivariate Gaussian distribution at each time t, independently of the other samples.

Derivation of edge FC

With capital letters denoting RVs, the expected product between two edge time series Cjk(t) and Clm(t) at time t is

$${\mathbb{E}}[{C}_{jk}(t){C}_{lm}(t)]= \; {\mathbb{E}}[{Z}_{j}(t){Z}_{k}(t){Z}_{l}(t){Z}_{m}(t)]\\ = \; \kappa ({Z}_{j}(t){Z}_{k}(t){Z}_{l}(t){Z}_{m}(t))\\ +\,{\mathbb{E}}[{Z}_{j}(t){Z}_{k}(t)]{\mathbb{E}}[{Z}_{l}(t){Z}_{m}(t)]\\ +\,{\mathbb{E}}[{Z}_{j}(t){Z}_{l}(t)]{\mathbb{E}}[{Z}_{k}(t){Z}_{m}(t)]\\ +\,{\mathbb{E}}[{Z}_{j}(t){Z}_{m}(t)]{\mathbb{E}}[{Z}_{k}(t){Z}_{l}(t)],$$
(8)

which follows from the definition of the joint cumulant κ(Zj(t)Zk(t)Zl(t)Zm(t)), also noting that the products involving the expectation of a single variable are equal to zero (i.e., \({\mathbb{E}}[{Z}_{i}(t)]=0\)) since zi are z-scored. The expression of the expectation in terms of joint cumulants is sometimes referred to as moment-cumulants formula38,39. The joint cumulant is equal to zero for Gaussian RVs39, allowing a simplification of Eq. (8) known as Isserlis’ theorem40:

$${\mathbb{E}}[{C}_{jk}(t){C}_{lm}(t)]= \; {\mathbb{E}}[{Z}_{j}(t){Z}_{k}(t)]{\mathbb{E}}[{Z}_{l}(t){Z}_{m}(t)]\\ \; +\,{\mathbb{E}}[{Z}_{j}(t){Z}_{l}(t)]{\mathbb{E}}[{Z}_{k}(t){Z}_{m}(t)]\\ \; +\,{\mathbb{E}}[{Z}_{j}(t){Z}_{m}(t)]{\mathbb{E}}[{Z}_{k}(t){Z}_{l}(t)].$$
(9)

If we additionally assume that Z(t) is ergodic (which does not preclude it from being a multivariate autoregressive process19), all the involved terms become independent of time and Eq. (9) further simplifies to

$${\mathbb{E}}[{C}_{jk}(t){C}_{lm}(t)]={r}_{jk}{r}_{lm}+{r}_{jl}{r}_{km}+{r}_{jm}{r}_{kl}.$$
(10)

In a mathematical sense, the expectation is to be intended over the population (ensemble), whereas the eFC is computed from a single session (sample). However, the ergodic assumption guarantees that sample estimates converge to the ensemble expectation as the number of time frames increases. We can then obtain an estimate of the eFC by substituting Eq. (10) into Eq. (7):

$${{{{{{{{\rm{eFC}}}}}}}}}_{jk,lm}=\frac{{r}_{jk}{r}_{lm}+{r}_{jl}{r}_{km}+{r}_{jm}{r}_{kl}}{\sqrt{1+2{{r}_{jk}}^{2}}\sqrt{1+2{{r}_{lm}}^{2}}}.$$
(11)

The i.i.d. multivariate Gaussian null model clearly satisfies the ergodic hypothesis since it has no memory. However, note that neither the i.i.d. nor the Gaussian properties are necessary since the derivations in this section assume ergodicity with the only additional constraint that the joint cumulant is equal to zero.

Derivation of edge communities

In order to formalise the intuition that two edges (jk) and \((j^{\prime} k^{\prime} )\) with similar rows of the eFC are likely to be clustered together, let us define their distance \({d}_{jk,j^{\prime} k^{\prime} }\) as the 1 norm of the difference between the corresponding rows of the (unnormalised) eFC:

$${d}_{jk,j^{\prime} k^{\prime} }:= \; \mathop{\sum }\limits_{l,m=1}^{N}\frac{1}{T}| {\mathbb{E}}\left[{C}_{jk}{C}_{lm}^{\top }\right]-{\mathbb{E}}\left[{C}_{j^{\prime} k^{\prime} }{C}_{lm}^{\top }\right]| \\ = \; \mathop{\sum }\limits_{l,m=1}^{N}| {r}_{jk}{r}_{lm}+{r}_{jl}{r}_{km}+{r}_{jm}{r}_{kl}\\ -{r}_{{j}^{\prime}{k}^{\prime}}{r}_{lm}-{r}_{{j}^{\prime}l}{r}_{{k}^{\prime}m}-{r}_{{j}^{\prime}m}{r}_{{k}^{\prime}l}| \\ = \; \mathop{\sum }\limits_{l,m=1}^{N}| {{{{{{{{\bf{z}}}}}}}}}_{j}{{{{{{{{\bf{z}}}}}}}}}_{k}^{\top }{{{{{{{{\bf{z}}}}}}}}}_{l}{{{{{{{{\bf{z}}}}}}}}}_{m}^{\top }+{{{{{{{{\bf{z}}}}}}}}}_{j}{{{{{{{{\bf{z}}}}}}}}}_{l}^{\top }{{{{{{{{\bf{z}}}}}}}}}_{k}{{{{{{{{\bf{z}}}}}}}}}_{m}^{\top }+{{{{{{{{\bf{z}}}}}}}}}_{j}{{{{{{{{\bf{z}}}}}}}}}_{m}^{\top }{{{{{{{{\bf{z}}}}}}}}}_{k}{{{{{{{{\bf{z}}}}}}}}}_{l}^{\top }\\ -\,{{{{{{{{\bf{z}}}}}}}}}_{{j}^{\prime}}{{{{{{{{\bf{z}}}}}}}}}_{{k}^{\prime}}^{\top }{{{{{{{{\bf{z}}}}}}}}}_{l}{{{{{{{{\bf{z}}}}}}}}}_{m}^{\top }-{{{{{{{{\bf{z}}}}}}}}}_{{j}^{\prime}}{{{{{{{{\bf{z}}}}}}}}}_{l}^{\top }{{{{{{{{\bf{z}}}}}}}}}_{{k}^{\prime}}{{{{{{{{\bf{z}}}}}}}}}_{m}^{\top }-{{{{{{{{\bf{z}}}}}}}}}_{{j}^{\prime}}{{{{{{{{\bf{z}}}}}}}}}_{m}^{\top }{{{{{{{{\bf{z}}}}}}}}}_{{k}^{\prime}}{{{{{{{{\bf{z}}}}}}}}}_{l}^{\top }| \\ = \; \mathop{\sum }\limits_{l,m=1}^{N}3| {{{{{{{{\bf{z}}}}}}}}}_{j}({{{{{{{{\bf{z}}}}}}}}}_{l}^{\top }{{{{{{{{\bf{z}}}}}}}}}_{m}){{{{{{{{\bf{z}}}}}}}}}_{k}^{\top }-{{{{{{{{\bf{z}}}}}}}}}_{{j}^{\prime}}({{{{{{{{\bf{z}}}}}}}}}_{l}^{\top }{{{{{{{{\bf{z}}}}}}}}}_{m}){{{{{{{{\bf{z}}}}}}}}}_{{k}^{\prime}}^{\top }| .$$
(12)

We now have the necessary ingredients to build a measure of similarity between the nodes, which can be used to predict the edge cluster similarity in ref. 17. There, the similarity between two nodes is measured as the frequency with which the corresponding edges are clustered together (having fixed the number of communities to 10). Instead of discrete assignments to 10 communities, Eq. (12) provides a continuous measure of the distance between two edges. The distance between two nodes i and j can then be defined as the sum of the distances between the edges starting from i and j  and reaching the same target:

$${d}_{i,j} = \mathop{\sum }\limits_{k=1}^{N}{d}_{ik,jk}\\ = \mathop{\sum}\limits_{k,l,m}3| ({{{{{{{{\bf{z}}}}}}}}}_{i}-{{{{{{{{\bf{z}}}}}}}}}_{j})({{{{{{{{\bf{z}}}}}}}}}_{l}^{\top }{{{{{{{{\bf{z}}}}}}}}}_{m}){{{{{{{{\bf{z}}}}}}}}}_{k}^{\top }| \\ \le \left\Vert ({{{{{{{{\bf{z}}}}}}}}}_{i}-{{{{{{{{\bf{z}}}}}}}}}_{j})\right\Vert \mathop{\sum}\limits_{k,l,m}3\left\Vert ({{{{{{{{\bf{z}}}}}}}}}_{l}^{\top }{{{{{{{{\bf{z}}}}}}}}}_{m}){{{{{{{{\bf{z}}}}}}}}}_{k}^{\top }\right\Vert \\ = {\left(\mathop{\sum}\limits_{t}{({z}_{i}(t)-{z}_{j}(t))}^{2}\right)}^{\frac{1}{2}}\mathop{\sum}\limits_{k,l,m}3\left\Vert ({{{{{{{{\bf{z}}}}}}}}}_{l}^{\top }{{{{{{{{\bf{z}}}}}}}}}_{m}){{{{{{{{\bf{z}}}}}}}}}_{k}^{\top }\right\Vert \\ = {\left((T-1)({{{{{{{\rm{Var}}}}}}}}[{{{{{{{{\bf{z}}}}}}}}}_{i}]+{{{{{{{\rm{Var}}}}}}}}[{{{{{{{{\bf{z}}}}}}}}}_{j}]-2{r}_{ij})\right)}^{\frac{1}{2}}\,\mathop{\sum}\limits_{k,l,m}\,3\left\Vert ({{{{{{{{\bf{z}}}}}}}}}_{l}^{\top }{{{{{{{{\bf{z}}}}}}}}}_{m}){{{{{{{{\bf{z}}}}}}}}}_{k}^{\top }\right\Vert \\ = {\left(1-{r}_{ij}\right)}^{\frac{1}{2}}\left[{(2(T-1))}^{\frac{1}{2}}\mathop{\sum}\limits_{k,l,m}3\left\Vert ({{{{{{{{\bf{z}}}}}}}}}_{l}^{\top }{{{{{{{{\bf{z}}}}}}}}}_{m}){{{{{{{{\bf{z}}}}}}}}}_{k}^{\top }\right\Vert \right]\\ \propto {\left(1-{r}_{ij}\right)}^{\frac{1}{2}},$$
(13)

where we used the Cauchy-Schwarz inequality and noted that the terms in the square brackets form a constant (independent of i and j). It is then apparent that the edge-cluster similarity17 between nodes i and j can be approximated by the nFC. Once again, note that the i.i.d. Gaussian RV assumption can be relaxed since the derivations in this section are based on Eq. (10), which requires ergodicity with the only additional constraint that the joint cumulant is equal to zero.

Derivation of RSS from the BOLD signal

Recalling the definition of the edge time series cij(t) in Eq. (6), the RSS defined in ref. 16 can be approximated as the squared Euclidean norm of the z-scored BOLD signal, up to a constant factor:

$${{{{{{{\rm{RSS}}}}}}}}(t): =\; \sqrt{\mathop{\sum }\limits_{i\,{ < }\,j}{{c}_{ij}(t)}^{2}}\\ =\; \sqrt{\frac{1}{2}\left(\mathop{\sum }\limits_{i,j=1}^{N}{{c}_{ij}(t)}^{2}-\mathop{\sum }\limits_{i=1}^{N}{{c}_{ii}(t)}^{2}\right)}\\ =\; \sqrt{\frac{1}{2}\left(\mathop{\sum }\limits_{i,j=1}^{N}{z}_{i}{(t)}^{2}{z}_{j}{(t)}^{2}-\mathop{\sum }\limits_{i=1}^{N}{z}_{i}{(t)}^{4}\right)}\\ =\; \sqrt{\frac{1}{2}\left({\left\Vert {{{{{{{\bf{z}}}}}}}}(t)\right\Vert }^{4}-\mathop{\sum }\limits_{i=1}^{N}{z}_{i}{(t)}^{4}\right)}\\ \approx \; \frac{1}{\sqrt{2}}{\left\Vert {{{{{{{\bf{z}}}}}}}}(t)\right\Vert }^{2}.$$
(14)

The approximation does not rely on the i.i.d. assumption; it is valid under the Gaussian null hypothesis and, more generally, for distributions with finite kurtosis—including fMRI data. Under this assumption, \({\left\Vert {{{{{{{\bf{z}}}}}}}}(t)\right\Vert }^{4}\) dominates ∑izi(t)4 in Eq. (14), as can be seen from the ratio of their (expected) values:

$$\frac{{\mathbb{E}}\left[\mathop{\sum}\limits_{i}{Z}_{i}{(t)}^{4}\right]}{{\mathbb{E}}\left[{\left\Vert {{{{{{{\bf{Z}}}}}}}}(t)\right\Vert }^{4}\right]}\le \frac{{\sum }_{i}{{{{{{{\rm{Kurt}}}}}}}}[{Z}_{i}(t)]}{{N}^{2}}\mathop{\longrightarrow }\limits_{N\to \infty }0.$$
(15)

The approximation in Eq. (14) can be replaced by an exact equality if all the N2 edge time series are included in the RSS definition (that is, all the (i, j) tuples, rather than only the pairs with i < j):

$${{{{{{{{\rm{RSS}}}}}}}}}_{{{{{{{{\rm{all}}}}}}}}}(t):=\sqrt{\mathop{\sum}\limits_{i,j}{{c}_{ij}(t)}^{2}}={\left\Vert {{{{{{{\bf{z}}}}}}}}(t)\right\Vert }^{2}.$$
(16)

Why are frames with the largest RSS most similar to the nFC?

Having rewritten the RSS as the squared Euclidean norm in Eq. (14), we can more easily investigate the conditions underpinning the largest RSS fluctuations. Let us withen the BOLD vector Z(t) to obtain the RV

$${{{{{{{\bf{W}}}}}}}}(t):={{{{{{{{\bf{R}}}}}}}}}^{-\frac{1}{2}}{{{{{{{\bf{Z}}}}}}}}(t)$$
(17)

and let

$${{{{{{{\bf{R}}}}}}}}={{{{{{{\bf{U}}}}}}}}{{{{{{{\boldsymbol{\Lambda }}}}}}}}{{{{{{{{\bf{U}}}}}}}}}^{\top }$$
(18)

be the eigendecomposition of the correlation matrix R, with Λ = diag(λ1, …, λN) and U being the unitary matrix of eigenvectors of R (since R is symmetric). Without loss of generality, assume that the eigenvalues are sorted in descending order, such that λ1 is the largest eigenvalue and u1 is the corresponding leading eigenvector. The RSS can be treated as a RV and rewritten in terms of W(t) and the eigenvector matrix U:

$${{{{{\mathrm{RSS}}}}}}(t) \approx \ \frac{1}{\sqrt{2}}{\left\| {{{{{\mathbf{Z}}}}}}(t)\right\|}^2 = \frac{1}{\sqrt{2}} {{{{{\mathbf{Z}}}}}}(t)^{\top} {{{{{\mathbf{Z}}}}}}(t) \\ = \ \frac{1}{\sqrt{2}} {{{{{\mathbf{W}}}}}}(t)^{\top} ({{{{{\mathbf{R}}}}}}^{\frac{1}{2}} )^{\top} \left({{{{{\mathbf{R}}}}}}^{\frac{1}{2}} \right) {{{{{\mathbf{W}}}}}}(t) \\ = \ \frac{1}{\sqrt{2}} ({{{{{\mathbf{W}}}}}}(t)^{\top} {{{{{\mathbf{U}}}}}} ){{{{{{\mathbf{\Lambda}}}}}}}({{{{{\mathbf{U}}}}}}^{\top} {{{{{\mathbf{W}}}}}}(t)) \\ = \ \frac{1}{\sqrt{2}} \sum\limits_i \lambda_i \left[({{{{{\mathbf{U}}}}}}^{\top} {{{{{\mathbf{W}}}}}}(t))_i \right]^2 \\ = \ \frac{1}{\sqrt{2}} \sum\limits_i \lambda_i \left\langle {{{{{\mathbf{u}}}}}}_i,{{{{{\mathbf{W}}}}}}(t) \right\rangle^2$$
(19)
$$\kern3.5pc =\frac{1}{\sqrt{2}}\mathop{\sum}\limits_{i}{\lambda }_{i}{\left\Vert {{{{{{{{\bf{u}}}}}}}}}_{i}\right\Vert }^{2}{\left\Vert {{{{{{{\bf{W}}}}}}}}(t)\right\Vert }^{2}{\cos }^{2}{{{\Theta }}}_{i}(t),$$
(20)

where ui is the i-th eigenvector and Θi(t) is the RV representing the angle formed by the vectors ui and W(t) at time t. Also note that \({\left\Vert {{{{{{{{\bf{u}}}}}}}}}_{i}\right\Vert }^{2}=1\) because U is unitary. For any realisations w(t) with squared norm \({\left\Vert {{{{{{{\bf{w}}}}}}}}(t)\right\Vert }^{2}\), an upper bound on the RSS is obtained as

$${{{{{{{\rm{RSS}}}}}}}}(t) \le \frac{1}{\sqrt{2}}{\left\Vert {{{{{{{\bf{w}}}}}}}}(t)\right\Vert }^{2}\mathop{\max }\limits_{i}{\lambda }_{i}\mathop{\sum}\limits_{i}{\cos }^{2}{\theta }_{i}(t)\\ = \frac{1}{\sqrt{2}}{\lambda }_{\max }{\left\Vert {{{{{{{\bf{w}}}}}}}}(t)\right\Vert }^{2},$$
(21)

noting that

$$\mathop{\sum}\limits_{i}{\cos }^{2}{\theta }_{i}(t)=\mathop{\sum}\limits_{i}\frac{{\langle {{{{{{{{\bf{u}}}}}}}}}_{i},{{{{{{{\bf{w}}}}}}}}(t)\rangle }^{2}}{{\left\Vert {{{{{{{{\bf{u}}}}}}}}}_{i}\right\Vert }^{2}{\left\Vert {{{{{{{\bf{w}}}}}}}}(t)\right\Vert }^{2}}=\frac{{\left\Vert {{{{{{{\bf{U}}}}}}}}{{{{{{{\bf{w}}}}}}}}(t)\right\Vert }^{2}}{{\left\Vert {{{{{{{\bf{w}}}}}}}}(t)\right\Vert }^{2}}=1.$$
(22)

The upper bound is reached when \({\theta }_{1}(t^{\prime} )=0\) or \({\theta }_{1}(t^{\prime} )=\pi\), which implies that \({{{{{{{\bf{w}}}}}}}}(t^{\prime} )=c\,{{{{{{{{\bf{u}}}}}}}}}_{1}\), where c is a constant. In other words, \({{{{{{{\bf{w}}}}}}}}(t^{\prime} )\) is aligned (parallel or antiparallel) with the leading eigenvector u1. When this happens, the BOLD signal vector \({{{{{{{\bf{z}}}}}}}}(t^{\prime} )\) must also be aligned with u1:

$${{{{{{{\bf{z}}}}}}}}(t^{\prime} )= \; {{{{{{{{\bf{R}}}}}}}}}^{\frac{1}{2}}{{{{{{{\bf{w}}}}}}}}(t^{\prime} )={{{{{{{{\bf{R}}}}}}}}}^{\frac{1}{2}}c\,{{{{{{{{\bf{u}}}}}}}}}_{1}\\ = \; {{{{{{{\bf{U}}}}}}}}{{{{{{{{\boldsymbol{\Lambda }}}}}}}}}^{\frac{1}{2}}{{{{{{{{\bf{U}}}}}}}}}^{\top }{{{{{{{{\bf{u}}}}}}}}}_{1}=c\,{\lambda }_{1}^{\frac{1}{2}}{{{{{{{{\bf{u}}}}}}}}}_{1}.$$
(23)

We can then refine our theoretical understanding of the RSS peaks: not only they occur when the Euclidean norm of the BOLD signal vector is large (as per Eq. (14)) but, most likely, when it is well aligned with the leading eigenvector of the static nFC (see Fig. 5b). If the alignment were perfect at a frame \(t^{\prime}\), the instantaneous estimate of the nFC would be

$${{{{{{{\bf{z}}}}}}}}(t^{\prime} ){{{{{{{\bf{z}}}}}}}}{(t^{\prime} )}^{\top }={c}^{2}{\lambda }_{1}{{{{{{{{\bf{u}}}}}}}}}_{1}{{{{{{{{\bf{u}}}}}}}}}_{1}^{\top },$$
(24)

i.e., an approximation of the nFC obtained from its leading eigenvector only (and independent of the sign of c). This approximation would achieve an average similarity (Pearson correlation coefficient) of r = 0.69 with the nFC over 100 unrelated participants of the HCP dataset (while, in practice, the highest similarity achieved by the top frame was r = 0.53). However, the alignment with u1 need not be perfect: in general, large RSS values can be expected whenever the BOLD signal is a mixture of the top eigenvectors. Additional principal components are expressed as more large-RSS frames are averaged, suggesting why the top 5% frames alone are sufficient for an almost perfect reconstruction of the nFC (Fig. 3). We have thus explained why cofluctuation patterns corresponding to frames with the largest RSS exhibit the highest similarity with the nFC. These results are based on Eq. (14) and hold true under the assumption of finite kurtosis (which also applies in the specific case of the null hypothesis, i.e., for Gaussian variables). The i.i.d. assumption is not required.

Null distribution of the RSS

The RSS can be written as a simple quadratic form \({{{{{{{\rm{RSS}}}}}}}}(t)=\frac{1}{\sqrt{2}}{\left\Vert {{{{{{{\bf{Z}}}}}}}}(t)\right\Vert }^{2}=\frac{1}{\sqrt{2}}{{{{{{{\bf{Z}}}}}}}}{(t)}^{\top }{{{{{{{\bf{Z}}}}}}}}(t)\), which is known to follow a generalised χ2 distribution under the null hypothesis of Gaussian variables41. The weights of the non-central chi-square components are proportional to the eigenvalues of the nFC matrix, i.e., \(\frac{{\lambda }_{1}}{\sqrt{2}},\ldots ,\frac{{\lambda }_{N}}{\sqrt{2}}\). Another characterisation of this distribution is provided by Eq. (19): under the null hypothesis, the inner product 〈ui, W(t)〉 follows a normal Gaussian distribution since \({{{{{{{\bf{W}}}}}}}}(t) \sim {{{{{{{\mathcal{N}}}}}}}}(0,1)\) and U is unitary. Therefore, \({\langle {{{{{{{{\bf{u}}}}}}}}}_{i},{{{{{{{\bf{W}}}}}}}}(t)\rangle }^{2}\) follows a χ2 distribution and each term \(\frac{{\lambda }_{i}}{\sqrt{2}}{\langle {{{{{{{{\bf{u}}}}}}}}}_{i},{{{{{{{\bf{W}}}}}}}}(t)\rangle }^{2}\) in Eq. (19) follows a Gamma (\(k=\frac{1}{2},\theta =\sqrt{2}{\lambda }_{i}\)) distribution. The RSS is thus obtained as a sum of N independent Gamma-distributed RVs, each associated with one eigenvalue of the nFC. The tail of the RSS is best approximated by the RVs associated with the largest eigenvalues (which have the largest mean and variance), while including smaller eigenvalues provides an increasingly fuller characterisation of the whole distribution (Fig. 4b). The mean and variance of the RSS can be readily obtained from the properties of the Gamma distribution:

$${\mathbb{E}}[{{{{{{{\rm{RSS}}}}}}}}]=\frac{1}{\sqrt{2}}\mathop{\sum}\limits_{i}{\lambda }_{i}=\frac{N}{\sqrt{2}}$$
(25)
$${{{{{{{\rm{Var}}}}}}}}[{{{{{{{\rm{RSS}}}}}}}}]=\mathop{\sum}\limits_{i}{\lambda }_{i}^{2}.$$
(26)

Higher moments of the RSS null distribution can be derived from its moment-generating function:

$${M}_{{{{{{{{\rm{RSS}}}}}}}}}(s)=\mathop{\prod}\limits_{i}{(1-\sqrt{2}{\lambda }_{i}s)}^{-\frac{1}{2}}.$$
(27)

Why are large RSS fluctuations present in many datasets?

The moment-generating function in Eq. (27) can be employed to show that the RSS is subexponential under the null hypothesis, which explains its heavy tail and the consequent large events25,42. Specifically, the subexponential feature of the null RSS follows from the sufficient condition

$${M}_{{{{{{{{\rm{RSS}}}}}}}}-{\mathbb{E}}[{{{{{{{\rm{RSS}}}}}}}}]}(s) =\; \mathop{\prod}\limits_{i}{(1-\sqrt{2}{\lambda }_{i}s)}^{-\frac{1}{2}}\exp -\frac{{\lambda }_{i}s}{\sqrt{2}}\\ \le \mathop{\prod}\limits_{i}\exp {\lambda }_{i}^{2}{s}^{2}=\exp {s}^{2}\mathop{\sum}\limits_{i}{\lambda }_{i}^{2},\\ \quad\ \forall | s| \le {(4{\lambda }_{\max })}^{-1}.$$
(28)

However, we can expect this behaviour under the more general hypothesis that the z-scored BOLD signal is sub-Gaussian, i.e., its tail decays at least as fast as that of a Gaussian RV (including, for example, any uniformly-bounded RVs). The reason is that the square of a sub-Gaussian RV is subexponential, and the sum of independent subexponential RVs is also subexponential. Therefore, being the RSS closely approximated by a sum of squared RVs (as per Eq. (14)), extreme events are to be expected under the general sub-Gaussian assumption for the BOLD signal, which offers an explanation for the large RSS fluctuations observed in most fMRI datasets.

How do nFC modules influence the edge cofluctuations?

Interestingly, Pope et al.26 have recently reported a connection between the presence of structural modules and the occurrence of large events in the edge cofluctuations (RSS). Insofar as structural and functional modules are in agreement43,44, we can explain these findings based on the nFC spectrum. How do functional modules shape the eigenspectrum of the nFC? In the ideal case of a block-diagonal matrix (with zeroes outside the blocks), the sum of the eigenvalues corresponding to each block coincides with the block size (since the diagonal elements are all ones and the trace is preserved under diagonalisation). As such, the largest eigenvalue is bounded by the size of the largest block, i.e., larger functional modules allow for larger eigenvalues. In turn, large eigenvalues underpin the high-amplitude cofluctuations, as shown in the ‘Why are frames with the largest RSS most similar to the nFC?’ section. Therefore, if the size of the modules is reduced via randomisation of the structural connectivity as in ref. 26(SI Fig. 3), the expected magnitude of the RSS cofluctuations will drop according to Eq. (21). This offers a mathematical explanation for the lower RSS event count when the modular structure is disrupted.

A null model for binary edge time series

Consider two parcels j and k with z-scored BOLD activity Zj(t) and Zk(t) and correlation coefficient rjk as defined by the nFC matrix. Under the static Gaussian null hypothesis, \({Z}_{k}(t) \sim {{{{{{{\mathcal{N}}}}}}}}(0,1)\) and

$${Z}_{j}(t)={r}_{kj}{Z}_{k}(t)+\sqrt{1-{r}_{kj}^{2}}{V}_{j}\,,$$
(29)

where \({V}_{j} \sim {{{{{{{\mathcal{N}}}}}}}}(0,1)\), and the square root coefficient ensures unit variance for Zj(t) since both parcels are z-scored. The edge time series corresponding to the two parcels is Cjk(t) = Zj(t)Zk(t), as per Eq. (6). By definition, the binary edge time series \(\overline{{C}_{jk}(t)}\) is equal to 1 when the cofluctuations between j and k are positive, i.e., when Zj(t) and Zk(t) have the same sign. In other words, \(\overline{{C}_{jk}(t)}\) is a Bernoulli(pjk) RV with probability

$${p}_{jk} =\; P[\overline{{C}_{jk}(t)} \, > \, 0]\\ =\; P[{Z}_{j}(t){Z}_{k}(t) \, > \, 0]\\ =\; 2P[({Z}_{k}(t) \, > \, 0)\cap ({Z}_{j}(t) \, > \, 0)]\\ =\; 2P\left[({Z}_{k}(t) \, > \, 0)\cap ({r}_{kj}{Z}_{k}(t)+\sqrt{1-{r}_{kj}^{2}}{V}_{j} \, > \, 0)\right]\\ =\; 2P\left[({Z}_{k}(t) \, > \, 0)\cap \left(\frac{{V}_{j}}{{Z}_{k}(t)} > -\frac{{r}_{kj}}{\sqrt{1-{r}_{kj}^{2}}}\right)\right]\\ =\; 2P[A],$$
(30)

where A is the event \(({Z}_{k}(t) \, > \,0)\cap \left(\frac{{V}_{j}}{{Z}_{k}(t)} > -\frac{{r}_{kj}}{\sqrt{1-{r}_{kj}^{2}}}\right)\). From a geometric perspective, A is satisfied by any vector (Zk, Vj) whose polar angle is between π/2 and \(\arctan (-\frac{{r}_{jk}}{\sqrt{1-{r}_{kj}^{2}}})\). Therefore,

$${p}_{jk} =\; 2P[A]\\ =\; 2\left(\frac{\frac{\pi }{2}-\arctan (-\frac{{r}_{jk}}{\sqrt{1-{r}_{kj}^{2}}})}{2\pi }\right)\\ =\; \frac{1}{2}+\frac{\arcsin ({r}_{jk})}{\pi }.$$
(31)

In conclusion, under the null hypothesis, the binary edge time series \(\overline{{C}_{jk}(t)}\) can be modelled as a Bernoulli(pjk) RV with \({p}_{jk}=\frac{1}{2}+\frac{\arcsin ({r}_{jk})}{\pi }\).

Relationship with coactivation patterns (CAPs)

Here, we will focus on explaining a well-known finding in the CAPs literature: choosing a seed node, Liu and Duyn12 observed a strong correlation between the BOLD activity at frames where the seed is highly active and the corresponding seed-based FC. Note that the correlation is measured between two N-dimensional vectors (where N is the number of nodes): the first vector is the BOLD signal and the second vector is the column of the nFC matrix that corresponds to the chosen seed node. In the following, we will show that the i.i.d. Gaussian null model can predict the observed correlation. Let k be the seed node and zk its z-scored BOLD time series. As usual, the associated random process is denoted with the capital letter Zk(t). If we condition on a specific value of the seed node, say \({z}_{k}^{* }:={Z}_{k}({t}^{* })\), the BOLD activity vector at the corresponding time t* is expected to align with the k-th column of the nFC (denoted as \({\hat{{{{{{{{\bf{R}}}}}}}}}}_{k}\)):

$${\mathbb{E}}[{{{{{{{\bf{Z}}}}}}}}({t}^{* })]={\mathbb{E}}[{{{{{{{\bf{Z}}}}}}}}(t)| {Z}_{k}(t)={z}_{k}^{* }]={z}_{k}^{* }\;{\hat{{{{{{{{\bf{R}}}}}}}}}}_{k}.$$
(32)

This is an elementary property of multivariate Gaussian RVs that follows directly from Eq. (29):

$${\mathbb{E}}[{Z}_{j}({t}^{* })] =\; {\mathbb{E}}[{Z}_{j}(t)| {Z}_{k}(t)={z}_{k}^{* }]\\ =\; {\mathbb{E}}\left[{r}_{kj}{Z}_{k}(t)+\sqrt{1-{r}_{kj}^{2}}{V}_{j}| {Z}_{k}(t)={z}_{k}^{* }\right]\\ =\; {z}_{k}^{* }\;{r}_{kj},\ \forall j=1,\ldots ,N.$$
(33)

Perhaps less intuitively, the conditional correlation between the BOLD vector Z(t*) and the k-th column of the sample nFC (\({\hat{{{{{{{{\bf{R}}}}}}}}}}_{k}\)) is also proportional to the seed value \({z}_{k}^{* }\). In order to prove it, let us first compute their expected inner product:

$${\mathbb{E}}[\langle {{{{{{{\bf{Z}}}}}}}}({t}^{* }),{\hat{{{{{{{{\bf{R}}}}}}}}}}_{k}\rangle ] =\; {\mathbb{E}}\left[\mathop{\sum }\limits_{j}^{N}{Z}_{j}({t}^{* })\frac{1}{T-1}\mathop{\sum }\limits_{t}^{T}{Z}_{j}(t){Z}_{k}(t)\right]\\ \approx\; \mathop{\sum}\limits_{j}{\mathbb{E}}\left[{Z}_{j}({t}^{* })\right]\frac{1}{T-1}\mathop{\sum}\limits_{t\ne {t}^{* }}{\mathbb{E}}\left[{Z}_{j}(t){Z}_{k}(t)\right]\\ =\; \mathop{\sum}\limits_{j}{z}_{k}^{* }{r}_{kj}\ \frac{1}{T-1}\mathop{\sum}\limits_{t\ne {t}^{* }}{r}_{kj}\\ =\; {z}_{k}^{* }\;{\left\Vert {{{{{{{{\bf{R}}}}}}}}}_{k}\right\Vert }^{2}.$$
(34)

Similarly, the covariance can be approximated as

$${{{{{{{\rm{Cov}}}}}}}}[{{{{{{{\bf{Z}}}}}}}}({t}^{* }),{\hat{{{{{{{{\bf{R}}}}}}}}}}_{k}] =\; {\mathbb{E}}\left[\frac{\langle {{{{{{{\bf{Z}}}}}}}}({t}^{* }),{\hat{{{{{{{{\bf{R}}}}}}}}}}_{k}\rangle }{N-1}-\frac{{\sum }_{j}{Z}_{j}({t}^{* }){\sum }_{l}{\hat{r}}_{kl}}{N(N-1)}\right]\\ \approx \; {z}_{k}^{* }\left(\frac{{\left\Vert {{{{{{{{\bf{R}}}}}}}}}_{k}\right\Vert }^{2}}{N}-\frac{{\sum }_{j\ne l}{r}_{kj}{r}_{kl}}{N(N-1)}\right)$$
(35)

The expected sample covariance is directly proportional to \({z}_{k}^{* }\) and thus peaks at frames with the highest activity of the seed node k. This remains true after normalising the covariance to obtain the correlation coefficient, as shown in Fig. 6a.

Human Connectome Project fMRI dataset

This study used openly-available and independently-acquired resting-state fMRI (rsfMRI) data from the Human Connectome Project (HCP) S1200 release45. In particular, we used the “100 unrelated subjects” dataset: a subset of 100 non-twins adult participants which were pre-selected by the HCP coordinators (54% female; mean age = 29.11 ± 3.67 years; age range, 22–36 years). The HCP study was approved by the Washington University Institutional Review Board, and informed consent was obtained from all participants. All subjects were scanned on a customized Siemens 3T “Connectome Skyra” with a 32-channel head coil, housed at Washington University in St. Louis. rsfMRI data were acquired in four runs of 15 min over a 2-day period, with eyes open and relaxed fixation on a projected bright cross-hair on a dark background (presented in a darkened room). Resting-state images were collected with the following parameters: gradient-echo EPI sequence, run duration = 14:33 min, TR = 720 ms, TE = 33.1 ms, flip angle = 52, FOV = 208 × 180 mm (RO x PE), matrix = 104 × 90 (RO x PE), slice thickness = 2 mm, 2-mm isotropic voxel resolution, multi-band factor = 8, echo spacing = 0.58 ms, BW = 2290 Hz/Px).

Preprocessing and ICA-FIX denoising

Functional images in the HCP dataset were minimally preprocessed according to the pipeline described in ref. 46. In short, the data were corrected for gradient distortion, susceptibility distortion and motion and then aligned to a corresponding T1-weighted image with one spline interpolation step. This volume was further corrected for intensity bias, normalised to a mean of 10,000, projected to the 32k_fs_LR mesh (excluding outliers), and aligned to a common space using a multi-modal surface registration.

In addition, the preprocessed rsfMRI data were cleaned of structured noise through a process that pairs independent component analysis (MELODIC) with FIX to automatically remove non-neural spatiotemporal components (trained on 25 hand-labelled HCP subjects). The FIX approach and initial results of classification accuracy are detailed in ref. 47, and the effects of the ICA + FIX cleanup (and optimal methods to remove the artefactual components from the data) are evaluated in detail in ref. 48. The cleaning pipeline is described more comprehensively in the HCP S1200 release reference manual (https://humanconnectome.org/study/hcp-young-adult/document/1200-subjects-data-release/) and the preprocessing and the cleaning scripts are openly available on Github (https://github.com/Washington-University/HCPpipelines/). The resulting ICA-FIX denoised rfMRI grayordinate surface time series are available as CIFTI files following the naming pattern: *REST1,2_LR,RL_Atlas_MSMAll_hp2000_clean.dtseries.nii.

The Schaefer200 parcellation was used to define 200 areas on the cerebral cortex49. This functional parcellation was designed to optimise both local gradient and global similarity measures of the fMRI signal and is openly available in ‘32k fs LR’ space for the HCP dataset. The nodes are mapped to the Yeo canonical functional networks50. The parcellated data were analysed both before and after regressing the global signal. The theoretical derivations and predictions hold and perform equally well in both cases, and we report any significant differences when they occur. Unless otherwise stated, the GSR results are shown in the figures since they are more directly comparable to those published in refs. 16,17, noting in particular that GSR was performed in ref. 16. Despite the ICA-FIX preprocessing pipeline used here is entirely different from those employed in refs. 16,17, our results are in excellent agreement with the previously published ones.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.