Introduction

There has been much recent interest in narrow Chern bands that spontaneously develop ferromagnetic order, following their appearance in a variety of Moiré materials1,2,3,4,5,6,7,8. The bands of magic-angle twisted bilayer graphene (MATBG) can, for example, be viewed as complementary Chern bands residing on opposite sublattices9,10,11,12, where spin, valley, and sublattice polarization lead to Chern insulators as observed in experiment5,13. The key question is: what is the nature of the charge carriers associated with doping these generalized Chern ferromagnets? The answer will have implications for the entire phase diagram of various Moiré materials, and could hold the key to explaining mysteries such as the doping dependence and origin of superconductivity in MATBG and related structures. The simplest example of a Chern band, the Landau level, has previously been shown to exhibit a ferromagnetic ground state at unit filling14,15. Despite the simplicity of the ground state, charge excitations can be highly nontrivial. In addition to single-electron quasiparticles corresponding to adding an electron with a reversed spin, this system also hosts charged skyrmions, smooth textures of the ferromagnetic order that carry an electric charge proportional to their topological winding14,16. The smallest nontrivial limit of a skyrmion is a quasiparticle bound to a single spin flip, i.e., a spin-polaron. (For a discussion of spin-polarons in non-topological ferromagnets, see, e.g., ref. 17.)

Upon passing from Landau levels to Chern bands, questions about nontrivial charge carriers become more subtle. On the one hand, these bands share the same topological character as Landau levels and are thus expected to realize similar charge excitations. On the other hand, in comparison to Landau levels, Chern bands possess a rich reciprocal space structure, including band dispersion and variation of band geometric features such as Berry curvature over the Brillouin zone. These features are not readily incorporated in the standard way of treating skyrmion excitations as real space textures. Furthermore, although skyrmions can be smoothly shrunk to electrons, the standard descriptions of the two excitations could not be more different: the former is a real space texture whose energy is computed using effective field theory14,15,18,19,20,21 or real space variational methods22,23,24, while the latter is a momentum eigenstate whose energy is obtained from momentum space Hartree–Fock. One route to connecting real and momentum space descriptions begins with small skyrmions, and in particular spin polarons, which are more naturally described as electrons dressed by spin flips25 rather than as real space textures. The dressed-electron picture furnishes a momentum space approach that can easily incorporate the reciprocal space properties of Chern insulators and can also facilitate the use of powerful diagrammatic techniques. However, even at the outset, this program poses a puzzle: how is information about band topology, necessary for electrically charged skyrmions, incorporated in the dressed-electron picture, i.e., how does a localized excitation in momentum space detect the topology of the entire band?

In this work, we take the first step towards a momentum space characterization of nontrivial charge excitations in Chern ferromagnets by studying the formation of the smallest skyrmion, the spin polaron, consisting of an electron dressed by one spin flip. We show that the matrix element of an arbitrary density–density interaction Vq between an electron-magnon state with magnon momentum q and one with momentum \({\bf{q}}^{\prime}\) is \(\propto i{\bf{q}}\wedge {\bf{q}}^{\prime} \frac{2\pi C}{{A}_{{\rm{BZ}}}}{V}_{{\bf{q}}-{\bf{q}}^{\prime} }\) at small q and \({\bf{q}}^{\prime}\), where C is the Chern number. This result can be understood by rewriting the interaction as d · E, where the electric field E is given by E = −qVq and the magnon dipole moment d, given by \({\boldsymbol{d}}=\frac{2\pi C}{{A}_{{\rm{BZ}}}}e\wedge {\bf{q}}^{\prime}\), is a consequence of the relationship between momentum and dipole moment in a Chern band15,26. Remarkably, this implies that there is an attractive interaction between an electron and a spin flip for any repulsive interaction V, which takes place in the (px + ipy)-wave channel, as a consequence of the band topology.

We further investigate the conditions under which such attractive interaction leads to the formation of a bound state and apply the results to models of twisted bilayer graphene. Although the spectrum of single-particle excitations in such systems has been obtained from self-consistent Hartree–Fock studies11,12,27,28 whose results are exact in certain limits29,30, the existence of other low-lying charged excitations, e.g. skyrmions or spin-polarons, implies that a single-quasiparticle based description is insufficient to capture the physics on doping correlated insulators. Such unconventional excitations were proposed by the authors and co-workers to play a crucial role in the “skyrmion mechanism” of superconductivity20,21.

To solve this problem, we exploit the fact that the Hilbert space of an electron plus a single spin flip scales only as \({N}^{2}\) for a system with N unit cells, which allows us to reach relatively large system sizes. Our results can be summarized as follows: (i) In the limit of vanishing quasiparticle dispersion (i.e., the non-interacting dispersion plus the interaction-generated dispersion), we find that the spin polaron is always lower in energy than the electron, with its energy increasing as the Berry curvature gets more concentrated. (ii) The existence of the spin polaron as a bound state is very sensitive to band topology, and it is lost if we drive a phase transition to Chern trivial bands. (iii) We find that there is a critical value for the effective mass of the quasiparticle bands, beyond which the single electron is lower in energy than an electron dressed with a spin flip. In this limit, although the spin polaron does not exist as a stable bound state, it can still influence physics as a resonance. Our results serve as a bridge between momentum space single-particle excitations and real space skyrmion excitations, with the energy of the spin polarons computed here providing a strict upper bound on the energy of skyrmion excitations. Furthermore, our results show that a description in terms of single-particle excitations, even when they are “exact”, is generally incomplete for understanding the physics of charge doping in a Chern ferromagnet. In the end, we discuss the implications of these results for the phenomenology of TBG.

Results

General formalism

We consider the Hamiltonian of a density–density interaction Vq projected onto a pair of SU(2)-symmetric bands with single-particle dispersion ϵ0(k) and wavefunctions \(\left|{u}_{{{{{{{{\bf{k}}}}}}}}}\right\rangle\):

$${{{{{{{\mathcal{H}}}}}}}}=\mathop{\sum}\limits_{{{{{{{{\bf{k}}}}}}}},\sigma=\uparrow,\downarrow }{c}_{{{{{{{{\bf{k}}}}}}}},\sigma }^{{{{\dagger}}} }{\epsilon }_{0}({{{{{{{\bf{k}}}}}}}}){c}_{{{{{{{{\bf{k}}}}}}}},\sigma }+\frac{1}{2A}\mathop{\sum}\limits_{{{{{{{{\bf{q}}}}}}}}}{V}_{{{{{{{{\bf{q}}}}}}}}}\delta {\rho }_{{{{{{{{\bf{q}}}}}}}}}\delta {\rho }_{-{{{{{{{\bf{q}}}}}}}}},$$
(1)

where \(\delta {\rho }_{{{{{{{{\bf{q}}}}}}}}}={\rho }_{{{{{{{{\bf{q}}}}}}}}}-{\bar{\rho }}_{{{{{{{{\bf{q}}}}}}}}}\) is the projected density measured relative to a certain reference chosen such that the interacting piece of the Hamiltonian annihilates the ferromagnetic state at half-filling (see ref. 12 for details). The projected density operator is given by

$${\rho }_{{{{{{{{\bf{q}}}}}}}}}=\mathop{\sum}\limits_{{{{{{{{\bf{k}}}}}}}}}{c}_{\sigma,{{{{{{{\bf{k}}}}}}}}}^{{{{\dagger}}} }{c}_{\sigma,{{{{{{{\bf{k}}}}}}}}+{{{{{{{\bf{q}}}}}}}}}{\lambda }_{{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}}),\qquad {\lambda }_{{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}})=\left\langle {u}_{{{{{{{{\bf{k}}}}}}}}}|{u}_{{{{{{{{\bf{k}}}}}}}}+{{{{{{{\bf{q}}}}}}}}}\right\rangle$$
(2)
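The form factor \({\lambda }_{{\bf{q}}}({\bf{k}})\) in Eq. (2) carries the band topology: the phase of its product around a small plaquette in momentum space is the Berry flux through that plaquette, and the total flux is 2πC. As a minimal sketch of this statement, the code below evaluates the Chern number from normalized nearest-neighbour overlaps on a discrete grid (the standard lattice plaquette method). The two-band lattice model, the mass parameter `m`, and all function names are assumptions chosen purely for illustration; they are not the continuum model studied in this work.

```python
import numpy as np

def lower_band_state(kx, ky, m):
    """Lower-band eigenvector of a toy two-band lattice model (illustrative stand-in)."""
    hz = m + np.cos(kx) + np.cos(ky)
    h = np.array([[hz, np.sin(kx) - 1j * np.sin(ky)],
                  [np.sin(kx) + 1j * np.sin(ky), -hz]])
    _, vecs = np.linalg.eigh(h)   # eigenvalues ascending, so column 0 is the lower band
    return vecs[:, 0]

def chern_number(m, n=24):
    """Sum of plaquette Berry fluxes built from overlaps <u_k|u_{k+q}> on an n x n grid."""
    ks = 2 * np.pi * np.arange(n) / n
    u = np.array([[lower_band_state(kx, ky, m) for ky in ks] for kx in ks])

    def link(a, b):
        # Normalized form factor lambda_q(k) = <u_k|u_{k+q}> for a nearest-neighbour q
        z = np.sum(a.conj() * b, axis=-1)
        return z / np.abs(z)

    ux = link(u, np.roll(u, -1, axis=0))   # k -> k + dkx
    uy = link(u, np.roll(u, -1, axis=1))   # k -> k + dky
    # Phase accumulated around each plaquette = Berry flux through it
    flux = np.angle(ux * np.roll(uy, -1, axis=0)
                    * np.roll(ux, -1, axis=1).conj() * uy.conj())
    return flux.sum() / (2 * np.pi)

if __name__ == "__main__":
    for m in (1.0, 3.0):
        print(f"m = {m}: C = {chern_number(m):+.3f}")
```

Because only gauge-invariant products of overlaps enter, the arbitrary phases returned by the eigensolver drop out, which is why the same overlaps can be used directly as interaction form factors without fixing a smooth gauge.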

If the bare dispersion ϵ0 is sufficiently small, the ground state of the Hamiltonian (1) is a ferromagnet \(\left|\!\downarrow \right\rangle={\prod }_{{\bf{k}}}{c}_{\downarrow,{\bf{k}}}^{\dagger }\left|0\right\rangle\) (annihilated by δρq for all q), with total spin \(S=\frac{N}{2}\), which we choose to have \({S}_{z}=-\frac{N}{2}\). Single-particle electron and hole excitations are given by \({\left|{\bf{k}}\right\rangle }_{e}={c}_{\uparrow,{\bf{k}}}^{\dagger }\left|\! \downarrow\! \right\rangle\) and \({\left|{\bf{k}}\right\rangle }_{h}={c}_{\downarrow,{\bf{k}}}\left|\! \downarrow \right\rangle\), respectively. The state \({\left|{\bf{k}}\right\rangle }_{e/h}\) has \({S}_{z}=-\frac{N-1}{2}\) and total spin \(S=\frac{N-1}{2}\), and its energy is given exactly (up to an irrelevant constant) by \({\mathcal{H}}{\left|{\bf{k}}\right\rangle }_{e/h}={\epsilon }_{e/h}({\bf{k}}){\left|{\bf{k}}\right\rangle }_{e/h}\). The quasiparticle dispersion ϵe/h(k) is given by ϵe/h(k) = ± ϵ0(k) + ϵF(k), where the interaction-generated dispersion \({\epsilon }_{F}({\bf{k}})=\frac{1}{2A}{\sum }_{{\bf{q}}}{V}_{{\bf{q}}}|{\lambda }_{{\bf{q}}}({\bf{k}}){|}^{2}\) is nothing but the Fock energy, which gives rise to a nontrivial band dispersion whenever the magnitude of the form factor λq(k) is k-dependent. In the following, we will mainly focus on the electron bands and drop the e subscript.

We now consider a basis of states containing an electron and a spin flip which is obtained from the ground state ferromagnet by creating two electrons with spin up and a hole with spin down

$$\left|{{{{{{{{\bf{k}}}}}}}}}_{e1},\,{{{{{{{{\bf{k}}}}}}}}}_{e2},\,{{{{{{{{\bf{k}}}}}}}}}_{h}\right\rangle={c}_{\uparrow,\,{{{{{{{{\bf{k}}}}}}}}}_{e1}}^{{{{\dagger}}} }{c}_{\uparrow,\,{{{{{{{{\bf{k}}}}}}}}}_{e2}}^{{{{\dagger}}} }{c}_{\downarrow,\,{{{{{{{{\bf{k}}}}}}}}}_{h}}\left|\!\downarrow\! \right\rangle$$
(3)

The effective Hamiltonian in the two-electron/one-hole sector is defined as \({H}_{{{{{{{{{\bf{k}}}}}}}}}_{e1}^{\prime},\,{{{{{{{{\bf{k}}}}}}}}}_{e2}^{\prime},\,{{{{{{{{\bf{k}}}}}}}}}_{h}^{\prime};{{{{{{{{\bf{k}}}}}}}}}_{e1},\,{{{{{{{{\bf{k}}}}}}}}}_{e2},\,{{{{{{{{\bf{k}}}}}}}}}_{h}}^{2e1h}=\left\langle {{{{{{{{\bf{k}}}}}}}}}_{e1}^{\prime},\,{{{{{{{{\bf{k}}}}}}}}}_{e2}^{\prime},\,{{{{{{{{\bf{k}}}}}}}}}_{h}^{\prime}\right|{{{{{{{\mathcal{H}}}}}}}}\left|{{{{{{{{\bf{k}}}}}}}}}_{e1},\,{{{{{{{{\bf{k}}}}}}}}}_{e2},\,{{{{{{{{\bf{k}}}}}}}}}_{h}\right\rangle\) whose explicit form is provided in the methods section. The Hamiltonian H2e1h acts on the state \(\left|{{{{{{{{\bf{k}}}}}}}}}_{e1},\,{{{{{{{{\bf{k}}}}}}}}}_{e2},\,{{{{{{{{\bf{k}}}}}}}}}_{h}\right\rangle\) by shifting two of the three momenta ke1, ke2 and kh such that the total momentum k = ke1 + ke2 − kh is conserved. Thus, we can separately diagonalize the Hamiltonian for each total momentum sector labeling the states by the two electronic momenta ke1 and ke2. Due to fermionic anticommutation relations, these only label \(\frac{N(N-1)}{2}\) distinct states for a grid with N points.
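The state counting above can be made concrete with a short enumeration. The sketch below is illustrative only: the integer labelling of momenta on an L × L grid (so that N = L²) and the function name `sector_basis` are our assumptions. It lists the basis states of one total-momentum sector by choosing unordered pairs of electron momenta and fixing the hole momentum by momentum conservation.

```python
from itertools import combinations

def sector_basis(L, total):
    """Basis states (ke1, ke2, kh) of the two-electron/one-hole sector with fixed total momentum.

    Momenta are integer pairs on an L x L grid, defined modulo L (an illustrative labelling).
    Antisymmetry under exchange of the two spin-up electrons means that only unordered
    pairs ke1 != ke2 label distinct states.
    """
    points = [(i, j) for i in range(L) for j in range(L)]
    basis = []
    for ke1, ke2 in combinations(points, 2):
        # The hole momentum follows from total = ke1 + ke2 - kh (mod L)
        kh = ((ke1[0] + ke2[0] - total[0]) % L,
              (ke1[1] + ke2[1] - total[1]) % L)
        basis.append((ke1, ke2, kh))
    return basis

if __name__ == "__main__":
    L = 4
    N = L * L
    states = sector_basis(L, total=(1, 2))
    print(len(states), N * (N - 1) // 2)
```

The quadratic growth of each sector with N is what keeps exact diagonalization feasible for grids that would be out of reach for a generic many-body problem.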

Notice that the Hilbert space spanned by the states (3) corresponds to states with definite \({S}_{z}=-\frac{N-3}{2}\) but with total spin \(S=\frac{N-3}{2}\) or \(\frac{N-1}{2}\). Since the Hamiltonian (1) conserves the total spin, we can label the eigenstates of H2e1h by \(S=\frac{N-1}{2},\frac{N-3}{2}\). The explicit form of the total spin operator in the basis (3) is provided in the methods section. Notice that the Hamiltonian H2e1h always has an eigenstate with total spin \(S=\frac{N-1}{2}\) and energy ϵ(k) generated by acting with the spin raising operator, which commutes with \({{{{{{{\mathcal{H}}}}}}}}\), on the single-particle excitation \(\left|{{{{{{{\bf{k}}}}}}}}\right\rangle\).

Our goal is to understand the energy competition between single-particle excitations (\(S=\frac{N-1}{2}\)) and those dressed by a spin flip (\(S=\frac{N-3}{2}\)). If the latter is lower in energy, this indicates the existence of a bound state of an electron and spin flip, a spin polaron. So far, our discussion has been very general. Our model has as inputs the interaction Vq, the bare dispersion ϵ0(k) and the form factors λq(k). It is worth mentioning that usually, ϵ0(k) and λq(k) are generated from the same microscopic model, so they are not always independent. However, we will choose to think of them as independent inputs, which allows us to take into account the effect of remote bands on dispersion. To study the competition between single-particle excitations and spin-dressed excitations, we use a class of continuum models that continuously interpolates between the LLL and the narrow Chern bands of TBG. This class of models provides an excellent playground to study the formation of spin polarons by allowing independent tuning of bandwidth, band topology, and band geometry.

Twisted bilayer graphene bands

Before discussing our results, it is instructive to briefly review the continuum model for twisted bilayer graphene (TBG)31,32,33, which consists of two Dirac Hamiltonians coupled through a Moiré potential. The latter has a matrix structure in the sublattice space and can be parametrized by two hopping parameters for intra- and inter-sublattice tunneling, denoted by wAA and wAB, whose ratio has been estimated to be around \(\kappa=\frac{{w}_{{\rm{AA}}}}{{w}_{{\rm{AB}}}}\,\approx\, 0.5{-}0.8\)34,35,36. A particularly interesting limit, called the chiral limit, corresponds to the case κ = 0 (ref. 9). The wavefunctions of the model in this limit are sublattice-polarized with Chern number ±1 and have been shown to be equivalent to the lowest Landau level of a Dirac particle in an inhomogeneous periodic magnetic field37, providing a direct relation between this model and Landau level physics. Even away from the κ = 0 limit, we can define such a sublattice-polarized basis where the bands have well-defined Chern numbers12. This leads to a total of 4 + 4 bands with Chern numbers +1 and −1.

The results of this work apply to TBG under two assumptions. First, we employ the independent Chern sector approximation by neglecting the coupling between Chern sectors. This approximation retains the U(4) symmetry rotating the bands within each Chern sector by ignoring inter-Chern dispersion and inter-Chern wavefunction overlaps. These are both relatively small perturbations of the U(4) × U(4) model and have comparable magnitude12. This approximation implies separate spin conservation in each sector, which allows us to sharply distinguish the electron from the polaron (otherwise, the two can, in principle, tunnel into each other). Second, we assume only two active bands with SU(2) spin rotation symmetry. The remaining bands are assumed to be completely filled or empty and only influence the problem by changing the dispersion ϵ0(k) through Hartree corrections, as we will discuss later. In the end, we will discuss the validity of these assumptions.

Flat quasiparticle dispersion

We will start by focusing on the limit of flat quasiparticle dispersion, where the single-particle dispersion and the interaction-generated dispersion exactly cancel for the electron band, ϵ0(k) = −ϵF(k). It is important to emphasize here that this limit is distinct from the flat band limit, in which the bare dispersion vanishes, ϵ0(k) = 0, and which occurs for the chiral model at the magic angle, assuming there are no other sources of dispersion. In particular, we will show later that the flat quasiparticle limit is realized to an excellent approximation for the electron (hole) band at ν = −1 (ν = +1), where the Hartree contribution from the filled bands gives rise to a single-particle term ϵ0(k) which almost exactly cancels the interaction-generated Fock dispersion ϵF(k). Another motivation for starting with this limit is that it allows us to isolate the effects of band geometry and topology from those of the dispersion, which will be added later. In general, we will define an effective momentum space magnetic field \({\mathcal{B}}=\frac{2\pi C}{{A}_{{\rm{BZ}}}}\) whose momentum space integral gives 2πC. We define the corresponding magnetic length as \({l}_{B}=\sqrt{\frac{|{\mathcal{B}}|}{2\pi }}\). Unless otherwise stated, we will be using the parameters for TBG at the magic angle θ = 1.0595° with chiral ratio κ = 0.55 and unscreened Coulomb interaction \({V}_{{\bf{q}}}=\frac{{e}^{2}}{2\epsilon {\epsilon }_{0}|{\bf{q}}|}\) with ϵ = 10.

From LLL to TBG Chern bands

Let us start with the simplest possible Chern band, the lowest Landau level. To compare this model to Chern bands defined in momentum space, we define a real space unit cell that contains a single flux quantum so that \({A}_{U}=\frac{2\pi }{B}\), where B is the real space magnetic field, which is equal to the inverse of the momentum space magnetic field \({\mathcal{B}}\) since \(B=\frac{2\pi }{{A}_{U}}=\frac{{A}_{{\rm{BZ}}}}{2\pi }={{\mathcal{B}}}^{-1}\). We can connect this LLL limit to chiral TBG using the results of ref. 37, which showed that the wavefunctions of the latter are equivalent to the LLL of a Dirac particle in the inhomogeneous magnetic field Beff(r) = B + B(r), for some specific B(r) which averages to zero over the unit cell. We now consider a Dirac particle in magnetic field Beff(r) = B + ηB(r) where η goes from 0 to 1, interpolating between the LLL and chiral TBG. We define ΔE to be the energy of the lowest \(S=\frac{N-3}{2}\) eigenvalue of H2e1h relative to the energy of the lowest \(S=\frac{N-1}{2}\) eigenvalue. Note that in the thermodynamic limit, there is a continuum of \(S=\frac{N-3}{2}\) states lying directly above the single-particle \(S=\frac{N-1}{2}\) state, which implies that ΔE ≤ 0. Numerically, we expect to get a positive value for ΔE which scales to 0 with increasing system size whenever a bound state is absent. Throughout this work, we will set ΔE to zero in these cases. ΔE is shown in Fig. 1a as a function of η. We can see clearly that its value is negative for all η, indicating the formation of a bound state of an electron and a spin flip with binding energy around −1.5 meV for our choice of parameters. Although changing η introduces variations in the Berry curvature distribution, the energy of the bound state is essentially independent of η.
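The flux bookkeeping at the start of this subsection (one flux quantum per unit cell, and a real space field equal to the inverse of the momentum space field) can be checked for any lattice geometry. The sketch below assumes units with ħ = e = 1, a triangular lattice, and an illustrative lattice constant of 13.4 nm; the function names and the numerical value are our choices, not inputs taken from the calculation.

```python
import numpy as np

def cell_area(a1, a2):
    """Area of the real space unit cell spanned by a1 and a2."""
    return abs(a1[0] * a2[1] - a1[1] * a2[0])

def reciprocal_vectors(a1, a2):
    """Rows are b1, b2 with a_i . b_j = 2 pi delta_ij."""
    a = np.array([a1, a2], dtype=float)
    return 2 * np.pi * np.linalg.inv(a).T

def momentum_space_field(a1, a2, chern=1):
    """Effective momentum space field 2 pi C / A_BZ."""
    b1, b2 = reciprocal_vectors(a1, a2)
    return 2 * np.pi * chern / cell_area(b1, b2)

if __name__ == "__main__":
    lattice_constant = 13.4  # nm, illustrative value only
    a1 = lattice_constant * np.array([1.0, 0.0])
    a2 = lattice_constant * np.array([0.5, np.sqrt(3) / 2])
    b_real = 2 * np.pi / cell_area(a1, a2)   # one flux quantum per unit cell
    b_mom = momentum_space_field(a1, a2)
    print(b_real * b_mom)                    # product of the two fields for C = 1
```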

Fig. 1: Binding energy of the polaron-bound state for flat quasiparticle dispersion.

Binding energy for flat quasiparticle dispersion (a) as we interpolate between the LLL in a uniform field and chiral TBG wavefunctions, (b) as a function of the chiral ratio κ, (c) as we tune the Chern number of the band by changing the bottom layer sublattice potential δbottom for fixed top layer sublattice potential δtop = 10 meV, and (d) as a function of the screening gate distance.

Tuning the chiral ratio

Next, we introduce deviations from the chiral limit by considering a non-zero value of the chiral ratio κ. Finite κ is known to alter the geometric properties of the bands and cause the Berry curvature to be concentrated at Γ28,37. For κ ≳ 0.7−0.8, the Berry curvature is very close to a delta function at Γ, which can be gauged away38, leading to the loss of the band’s topological character. As we can see in Fig. 1b, the bound state persists for all values of κ ≲ 0.8, but its binding energy starts to approach zero as we approach the limit of very concentrated Berry curvature, hinting at its topological origin.

Sublattice potential

We can see the effect of topology more clearly by considering a tuning parameter which alters the band topology by inducing a phase transition to a trivial Chern band. This is done by adding a layer-dependent sublattice potential δtop/bottom, which can be physically realized by alignment with hBN5,39. As was shown in ref. 40, the sublattice-polarized bands have vanishing Chern number when δtop + δbottom is close to 0, and finite Chern number ±1 otherwise. In Fig. 1c, we show ΔE as a function of δbottom for fixed δtop = 10 meV. We see that ΔE remains roughly constant on the topological side until we approach the transition, where it rapidly increases until it vanishes on the non-topological side.

Gate screening

So far, we have been considering the unscreened Coulomb interaction relevant for TBG samples where the distance to the gate is much larger than the Moiré length scale. We will now consider the effect of gate screening by taking Vq to be the double-gate screened Coulomb interaction \({V}_{{\bf{q}}}(d)=\frac{{e}^{2}}{2\epsilon {\epsilon }_{0}|{\bf{q}}|}\tanh (|{\bf{q}}|d)\) with d denoting the gate distance. Since changing d also alters the overall energy scale, we will find it more useful to measure energy in terms of the scale \({E}_{C}(d)=\frac{1}{2A}{\sum }_{{\bf{q}}}{V}_{{\bf{q}}}(d){e}^{-\frac{|B|}{2}{{\bf{q}}}^{2}}\), which reduces to half the particle-hole gap for the LLL. The polaron binding energy is plotted as a function of d in Fig. 1d, which shows how ΔE starts decreasing in magnitude when the gate distance is around 10 nm (roughly the Moiré scale) until it vanishes in the limit d → 0. This is consistent with what is known about skyrmion energies, which approach the single-particle energy as the screening length is reduced. In fact, it was shown in ref. 22 that for the LLL, the energy of skyrmions of any size is the same as that of single-particle excitations for a delta potential, i.e., d → 0.
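To give a feeling for the numbers, the screened interaction and the energy scale E_C(d) can be evaluated in the continuum limit, replacing the momentum sum by an integral. The sketch below rests on several assumptions that are ours: the factor |B| in the Gaussian is written as the square of a magnetic length `l_b`, whose default of 5 nm is a placeholder rather than a value taken from the calculation; energies are in meV and lengths in nm; and the integral is done with a simple trapezoid rule.

```python
import numpy as np

E2_OVER_4PI_EPS0 = 1439.96  # e^2 / (4 pi eps_0) in meV nm

def coulomb_gate_screened(q, d, eps=10.0):
    """Double-gate screened Coulomb interaction e^2 tanh(q d) / (2 eps eps_0 q), in meV nm^2."""
    return 2.0 * np.pi * E2_OVER_4PI_EPS0 * np.tanh(q * d) / (eps * q)

def energy_scale(d, l_b=5.0, eps=10.0, n_grid=20001):
    """E_C(d) with (1/2A) sum_q replaced by (1/2) int d^2q / (2 pi)^2 (continuum assumption)."""
    q = np.linspace(1e-8, 10.0 / l_b, n_grid)   # the Gaussian is negligible beyond 10 / l_b
    integrand = (q * coulomb_gate_screened(q, d, eps)
                 * np.exp(-0.5 * (l_b * q) ** 2) / (4.0 * np.pi))
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(q))

if __name__ == "__main__":
    for d in (1.0, 10.0, 100.0):
        print(f"d = {d:6.1f} nm: E_C = {energy_scale(d):.2f} meV")
```

For gate distances much larger than `l_b` this integral approaches one half of √(π/2) e²/(4πϵϵ0 l_b), i.e., half of the familiar exchange gap of the lowest Landau level, in line with the statement above. For small d the interaction becomes independent of q, which is the delta potential limit.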

Topological electron-magnon coupling

To understand the existence of a bound state of an electron and a spin flip, it is instructive to rewrite the Hamiltonian H2e1h by labeling the Hilbert space of \({S}_{z}=-\frac{N-3}{2}\) in terms of an electron and a magnon excitation. The latter corresponds to the eigenmodes of the Hamiltonian in the space of single-spin flip operators

$${a}_{n,{{{{{{{\bf{q}}}}}}}}}^{{{{\dagger}}} }=\mathop{\sum}\limits_{{{{{{{{\bf{k}}}}}}}}}{c}_{{{{{{{{\bf{k}}}}}}}}, \uparrow }^{{{{\dagger}}} }{c}_{{{{{{{{\bf{k}}}}}}}}+{{{{{{{\bf{q}}}}}}}},\,\downarrow\! }{\phi }_{n,{{{{{{{\bf{q}}}}}}}}}\left({{{{{{{\bf{k}}}}}}}}\right),\quad {{{{{{{\mathcal{H}}}}}}}}{a}_{n,{{{{{{{\bf{q}}}}}}}}}^{{{{\dagger}}} }\left|\! \downarrow \right\rangle={\xi }_{n,{{{{{{{\bf{q}}}}}}}}}{a}_{n,{{{{{{{\bf{q}}}}}}}}}^{{{{\dagger}}} }\left|\! \downarrow \right\rangle$$
(4)

where q belongs to the first BZ. The operators \({a}_{n,{{{{{{{\bf{q}}}}}}}}}^{{{{\dagger}}} }\) provide a complete N-dimensional orthonormal basis for spin flip operators, which can be used to represent any spin flip operator \({c}_{{{{{{{{\bf{k}}}}}}}}, \uparrow }^{{{{\dagger}}} }{c}_{{{{{{{{\bf{k}}}}}}}}+{{{{{{{\bf{q}}}}}}}},\,\!\downarrow\! }\). The lowest energy state n = 0 corresponds to the Goldstone mode of the broken SU(2) spin symmetry whose dispersion satisfies ξ0,q → 0 as q → 0.

Using this basis, we can construct a non-orthogonal basis of electron-magnon states as \(\left|{{{{{{{{\bf{k}}}}}}}}}_{0};{{{{{{{\bf{q}}}}}}}},\,\,n\right\rangle={c}_{{{{{{{{{\bf{k}}}}}}}}}_{0} \!{+} {{{{{{{\bf{q}}}}}}}},\, \uparrow }^{{{{\dagger}}} }{a}_{n,{{{{{{{\bf{q}}}}}}}}}^{{{{\dagger}}} }\left|\! \downarrow \right\rangle\) whose properties are discussed in detail in the methods section. In this basis, the Hamiltonian has the form

$${{{{{{{\mathcal{H}}}}}}}}\left|{{{{{{{{\bf{k}}}}}}}}}_{0};{{{{{{{\bf{q}}}}}}}},\,n\right\rangle= \, \left[{\xi }_{n,{{{{{{{\bf{q}}}}}}}}}+\epsilon ({{{{{{{{\bf{k}}}}}}}}}_{0}+{{{{{{{\bf{q}}}}}}}})\right]\left|{{{{{{{{\bf{k}}}}}}}}}_{0};{{{{{{{\bf{q}}}}}}}},\,\,n\right\rangle \\ +\frac{1}{A}\mathop{\sum}\limits_{{{{{{{{\bf{q}}}}}}}}^{\prime} }{V}_{{{{{{{{\bf{q}}}}}}}}^{\prime} }{\lambda }_{{{{{{{{\bf{q}}}}}}}}^{\prime} }^{*}({{{{{{{{\bf{k}}}}}}}}}_{0}+{{{{{{{\bf{q}}}}}}}}){C}_{{{{{{{{\bf{q}}}}}}}},\,{{{{{{{\bf{q}}}}}}}}^{\prime} }^{nm}\left|{{{{{{{{\bf{k}}}}}}}}}_{0};{{{{{{{\bf{q}}}}}}}}+{{{{{{{\bf{q}}}}}}}}^{\prime},\,m\right\rangle$$
(5)

where \({C}_{{{{{{{{\bf{q}}}}}}}},\,{{{{{{{\bf{q}}}}}}}}^{\prime} }^{nm}\) are defined as

$${C}_{{{{{{{{\bf{q}}}}}}}},\,{{{{{{{\bf{q}}}}}}}}^{\prime} }^{nm}=\mathop{\sum}\limits_{{{{{{{{\bf{k}}}}}}}}}{\phi }_{m,{{{{{{{\bf{q}}}}}}}}+{{{{{{{\bf{q}}}}}}}}^{\prime} }^{*}({{{{{{{\bf{k}}}}}}}}) \left[{\lambda }_{{{{{{{{\bf{q}}}}}}}}^{\prime} }\left({{{{{{{\bf{k}}}}}}}}\right){\phi }_{n,{{{{{{{\bf{q}}}}}}}}}\left({{{{{{{\bf{k}}}}}}}}+{{{{{{{\bf{q}}}}}}}}^{\prime} \right)-{\phi }_{n,{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}}){\lambda }_{{{{{{{{\bf{q}}}}}}}}^{\prime} }\,({{{{{{{\bf{k}}}}}}}}+{{{{{{{\bf{q}}}}}}}})\right]$$
(6)

The meaning of the different terms in the Hamiltonian above is transparent. The first two terms correspond to the magnon and the electron dispersion, respectively. The last term corresponds to the matrix elements of the interaction between electron-magnon states with magnon momenta q and \({\bf{q}}^{\prime}\). Low-lying excitations have their largest weight in the lowest magnon branch n = 0. If we focus on this branch, n = m = 0, and take the limit of small q and \({\bf{q}}^{\prime}\), we find

$${C}_{{{{{{{{\bf{q}}}}}}}},\,{{{{{{{\bf{q}}}}}}}}^{\prime} }^{00}\,\approx\, i{{{{{{{\bf{q}}}}}}}}\wedge {{{{{{{\bf{q}}}}}}}}^{\prime} B,\quad B=\frac{2\pi C}{{A}_{{{{{{{{\rm{BZ}}}}}}}}}}$$
(7)

To see where this expression comes from, it is instructive to first consider the case of the LLL. One simplification we can do here is to unfold the BZ by extending q beyond the first BZ and removing the index n. The coefficient \({C}_{{{{{{{{\bf{q}}}}}}}},\,{{{{{{{\bf{q}}}}}}}}^{\prime} }\) is then precisely the coefficient of the commutator of the GMP algebra \([{\rho }_{{{{{{{{\bf{q}}}}}}}}},{a}_{{{{{{{{\bf{q}}}}}}}}^{\prime} }^{{{{\dagger}}} }]={C}_{{{{{{{{\bf{q}}}}}}}},\,{{{{{{{\bf{q}}}}}}}}^{\prime} }{a}_{{{{{{{{\bf{q}}}}}}}}+{{{{{{{\bf{q}}}}}}}}^{\prime} }^{{{{\dagger}}} }\)41, which in this case is equal to \(2i\sin \left(\frac{B}{2}{{{{{{{\bf{q}}}}}}}}\wedge {{{{{{{\bf{q}}}}}}}}^{\prime} \right)\). For a general Chern band, the GMP algebra holds to linear order in q and \({{{{{{{\bf{q}}}}}}}}^{\prime}\)42 with the prefactor given by \(iB{{{{{{{\bf{q}}}}}}}}\wedge {{{{{{{\bf{q}}}}}}}}^{\prime}\) which is precisely what we get in Eq. (7). A more detailed derivation of this result is given in the methods section.
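The reduction of the LLL coefficient to the linear form of Eq. (7) is easy to verify numerically. The sketch below simply compares the two expressions quoted above; the function names and the sample momenta are illustrative assumptions.

```python
import numpy as np

def wedge(q, qp):
    """Two-dimensional wedge product q ^ q' = q_x q'_y - q_y q'_x."""
    return q[0] * qp[1] - q[1] * qp[0]

def gmp_lll(q, qp, b):
    """LLL coefficient 2i sin(B q ^ q' / 2)."""
    return 2j * np.sin(0.5 * b * wedge(q, qp))

def gmp_linear(q, qp, b):
    """Small-momentum form i B q ^ q', valid for a general Chern band."""
    return 1j * b * wedge(q, qp)

if __name__ == "__main__":
    b = 1.0
    for scale in (1.0, 0.3, 0.1):
        q, qp = (scale, 0.0), (0.0, scale)
        ratio = gmp_lll(q, qp, b) / gmp_linear(q, qp, b)
        print(f"|q| = {scale}: ratio = {ratio.real:.6f}")
```

Both expressions are antisymmetric under exchange of q and q′ and vanish for parallel momenta, so the coupling is strongest between magnon momenta that are perpendicular to each other.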

Let us write a general eigenstate of (5) as

$$\left|{{\Psi }}\right\rangle=\mathop{\sum}\limits_{n,{{{{{{{\bf{q}}}}}}}}}{r}_{n,{{{{{{{\bf{q}}}}}}}}}\left|{{{{{{{{\bf{k}}}}}}}}}_{0};{{{{{{{\bf{q}}}}}}}},\,n\right\rangle$$
(8)

Focusing on the n = 0 component in the small q limit, we see from (7) that the magnitude of the last term in the Hamiltonian (5) is maximized when connecting states \(\left|{\bf{k}}_{0};{\bf{q}},\,n\right\rangle\) and \(\left|{\bf{k}}_{0};{\bf{q}}^{\prime},\,n\right\rangle\) with q and \({\bf{q}}^{\prime}\) orthogonal, i.e., if they are related by a π/2 rotation. Furthermore, due to the factor of i in (7), we can make this term negative by choosing r0,q to change its phase by π/2 upon rotating q by π/2. Thus, we can minimize this term by choosing \({r}_{0,{\bf{q}}}\propto {e}^{i\arg ({q}_{x} \!+i{q}_{y})}\). In addition, since this term vanishes when q or \({\bf{q}}^{\prime}\) vanish, the magnitude of r0,q should not vanish too quickly with q. We can see this by expanding the first two terms in (5) at small momenta, \({\xi }_{0,{\bf{q}}} \sim {l}_{B}^{2}\rho {{\bf{q}}}^{2}\) and \(\epsilon ({\bf{k}}_{0}+{\bf{q}}) \sim \frac{{l}_{B}^{2}}{{m}_{{\rm{eff}}}}{{\bf{q}}}^{2}\). Then, if we assume that r0,q decays for momenta larger than some cutoff Λ, we find that the first two terms in the Hamiltonian give a positive energy contribution of order \({l}_{B}^{2}(\rho+{m}_{{\rm{eff}}}^{-1}){\Lambda }^{2}\), whereas the last term gives a negative contribution of order \({E}_{C}{l}_{B}^{3}{\Lambda }^{3}\). Thus, a bound state has to have a finite extent in momentum of at least \({\Lambda } \sim \frac{\rho+{m}_{{\rm{eff}}}^{-1}}{{E}_{C}{l}_{B}}\). This is verified by plotting r0,q for both the LLL and chiral TBG in Fig. 2. We see that r0,q decays in q within the first BZ and that \(\arg {r}_{0,{\bf{q}}}\) winds by 2π around the Γ point. For the LLL case, due to continuous magnetic translation symmetry, we can unfold the BZ and write the variational state \({r}_{{\bf{q}}}={e}^{-\frac{\xi }{2}|{\bf{q}}|+i\arg ({q}_{x}+i{q}_{y})}\), whose overlap with the numerically obtained solution exceeds 99% for appropriately chosen ξ (see supplemental material for details).
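The competition between the two contributions can be summarized in a rough scaling estimate. The following is only a sketch: it treats the orders of magnitude quoted above as if they held with unit prefactors, which is a simplifying assumption and not a result of the calculation.

$$E({\Lambda }) \sim {l}_{B}^{2}\left(\rho+{m}_{{\rm{eff}}}^{-1}\right){\Lambda }^{2}-{E}_{C}{l}_{B}^{3}{\Lambda }^{3},\qquad E({\Lambda }) < 0\iff {\Lambda } > \frac{\rho+{m}_{{\rm{eff}}}^{-1}}{{E}_{C}{l}_{B}}$$

The quadratic cost from the magnon and electron dispersions dominates at small Λ, while the cubic gain from the topological coupling dominates at large Λ. Within this estimate, a lighter quasiparticle (larger inverse effective mass) pushes the required momentum extent to larger values, which fits with the loss of the bound state for sufficiently dispersive bands.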

Fig. 2: Polaron wavefunctions.

Color plot of the polaron wavefunction as a function of the magnon momentum q (cf. Eq. (8)) for (a) the LLL, (b) TBG bands with flat quasiparticle dispersion, and (c) the electron (hole) doping the ν = −3 (ν = +3) insulator.

Finite quasiparticle dispersion

Let us now consider the limit of dispersive quasiparticle bands. Motivated by the energetics of TBG bands, we will choose ϵ0(k) = νϵH(k), where \({\epsilon }_{H}({\bf{k}})=\frac{1}{A}{\sum }_{{\bf{G}}}{V}_{{\bf{G}}}{\lambda }_{{\bf{G}}}({\bf{k}}){\sum }_{{\bf{k}}^{\prime} }\,{\lambda }_{-{\bf{G}}}({\bf{k}}^{\prime} )\) is the Hartree potential11,28,30,43,44,45. This form of the dispersion directly allows us to compare our results to TBG, since the quasiparticle dispersion ϵν(k) = νϵH(k) + ϵF(k) describes the dispersion of electron (hole) bands for a correlated insulator at integer filling ν (−ν) in the independent Chern sector approximation12,44. Although the expression for ϵν(k) is only valid for integer ν, we will choose to take ν to be a continuous variable, which allows us to continuously tune the band dispersion. It also makes our results less sensitive to uncertainty in model parameters. For instance, while our model is particle-hole symmetric, we can phenomenologically incorporate particle-hole asymmetry by shifting ν, which approximately captures the effects discussed in ref. 46. ϵH(k) is characterized by a dip at the Γ point and thus has a qualitatively similar shape to the Fock term ϵF(k) up to overall scaling. Thus, for ν > 0, the two terms add, leading to a large bandwidth, while for ν < 0, they subtract, leading to a reduced bandwidth11,30,44,45. The minimum bandwidth is realized for ν ≈ −1 to −1.5 depending on the value of κ, as shown by the dashed line in Fig. 3a. The details of the dispersion are reviewed in the methods section.

Fig. 3: Polaron energetics in twisted bilayer graphene.

(a) Binding energy as a function of “filling” ν and chiral ratio κ. The dashed lines indicate integer fillings ν where our dispersion matches the HF dispersion for electron doping. A bound state is only found for ν < 0, indicating doping towards neutrality. The solid black line indicates the boundary where the bound state is lost, the dashed gray line corresponds to the minimum bandwidth, and the gray shaded area is where bands overlap and our analysis becomes invalid. (b) The corresponding inverse effective mass at the bottom of the band. The phase boundary is well approximated by the value me/meff ≈ 3. (c, d) Spectra for electron doping the ν = −1 and ν = −3 insulators, respectively.

We can investigate the existence of a bound state as a function of the dispersion parameterized by ν. We find that there is a critical value νc, indicated by the solid line in Fig. 3a, such that a bound state exists if and only if ν < νc. We see that this value is always negative and ranges from around −0.5 in the chiral limit to around −1.2 at κ = 0.7. The implication for TBG is that, generally, polaron formation is favored on electron (hole) doping the ν < 0 (ν > 0) insulators, i.e., doping towards neutrality. On doping away from neutrality, we always find the single-particle excitations to be lower in energy. We note that for ν sufficiently large and negative, the Hartree term dominates and we get a peak rather than a dip at Γ. In this case, we always find a polaron-bound state even though the bandwidth can be quite large. This rather surprising result can be explained by examining the wavefunction of the Γ polaron in Fig. 2c. We see that, compared to the case of flat quasiparticle dispersion, the wavefunction has suppressed weight at small q, enabling it to avoid the energetically costly region around Γ.

This suggests that the formation of a bound state is mostly sensitive to the effective mass at the bottom of the band, which is relatively large at the band minimum for ν large and negative, rather than to the total bandwidth. The effective mass has the added advantage of being an experimentally accessible quantity, e.g., from quantum oscillations1,2, allowing us to make phenomenological comparisons with experiments that are not tied to theory parameters. In Fig. 3b, we show the effective mass for different values of (ν, κ) and compare it to the phase diagram in Fig. 3a. We find that, quite remarkably, the phase boundary where the polaron-bound state is lost can be described very well by the expression me/meff ≈ 3, as shown in Fig. 3b. This value is not far off the experimentally extracted value meff/me ≈ 0.2−0.3 from quantum oscillations for small hole doping of the ν = −2 state where superconductivity was first seen1. This suggests that the experimental regime for superconductivity can be close to the phase boundary where a bound state would form, even when the lowest charged excitations are single-particle-like.
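As a rough illustration of how the criterion me/meff ≈ 3 could be checked for a given dispersion, the sketch below estimates the inverse effective mass from the curvature at the band minimum using central finite differences. The function names, the choice of units (meV and nm⁻¹), and the parabolic test band are assumptions made for illustration only; they are not taken from our calculations.

```python
# hbar^2 / m_e in meV nm^2 (from hbar^2 / (2 m_e) ~ 38.1 meV nm^2).
HBAR2_OVER_ME = 76.2


def inverse_mass_ratio(eps, k_min, dk=1e-3):
    """Estimate m_e / m_eff from the curvature of eps(kx, ky) at k_min.

    eps   : callable returning the energy in meV for momenta in nm^-1
    k_min : (kx, ky) position of the band minimum
    dk    : finite-difference step in nm^-1
    The curvature is averaged over x and y, which assumes a nearly
    isotropic band minimum.
    """
    kx, ky = k_min
    e0 = eps(kx, ky)
    d2x = (eps(kx + dk, ky) - 2.0 * e0 + eps(kx - dk, ky)) / dk**2
    d2y = (eps(kx, ky + dk) - 2.0 * e0 + eps(kx, ky - dk)) / dk**2
    return 0.5 * (d2x + d2y) / HBAR2_OVER_ME


def parabolic_band(kx, ky, ratio=3.0):
    """Hypothetical test band eps = hbar^2 k^2 / (2 m_eff) with m_e/m_eff = ratio."""
    return 0.5 * ratio * HBAR2_OVER_ME * (kx**2 + ky**2)


# Inverse mass ratio of the test band; compare with the threshold of about 3.
ratio = inverse_mass_ratio(parabolic_band, (0.0, 0.0))
```

For a realistic dispersion one would pass an interpolated ϵν(k) in place of the hypothetical `parabolic_band` and evaluate the curvature at the actual band minimum.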

Finally, we investigate the dispersion of the spin polaron state in the limit when it is the lowest energy excitation. We consider two cases: (i) electron (hole) doping the ν = −1 (ν = 1) insulator where the dispersion minimum is at Γ and (ii) electron (hole) doping the ν = −3 (ν = +3) insulator where the dispersion maximum is at Γ. The resulting dispersion is shown in Fig. 3c, d. We see that the polarons have much flatter bands with significantly larger effective mass that can be as large as 30 times the electron’s effective mass.

Discussion

Implications for TBG

Before discussing potential implications of our results, let us point out a few caveats. First, the lowest energy charged excitations we obtain here are only the lowest among single-particle states and those dressed by a single spin flip. This does not rule out the possibility that states with more spin flips are lower energy excitations (which is known to be the case for the LLL14,15), even in cases where the spin polaron is not lower in energy than single-particle excitations. This means that our results establish a parameter regime where the electron is not the lowest energy charge e excitation but cannot make a strong statement about the precise number of spin flips or the nature of charge e excitations outside that regime. We note, however, that once we are in the domain of stability of polarons, we expect significant modification of the physics compared to single electrons, even if we cannot determine precisely the number of bound spin flips. In other words, we believe that once spin polarons are formed, no matter their precise size, their properties are well approximated by the simplest nontrivial one studied here, e.g., they will have very flat dispersion. We note here that by applying a Zeeman field, the energy of a spin polaron with n spin flips is increased by \(\Delta E_{S_{z}=(N-1)/2-n}=(n+1/2)E_{\rm Zeeman}\). This disfavors larger polarons/skyrmions, and the field can be chosen such that the single-spin-flip polaron n = 1 is the lowest energy excitation if it is already lower in energy than the electron at zero field, provided that the energy as a function of the number n of spin flips is a convex function. This means that the energy difference between the single-spin-flip polaron and the electron exceeds the difference between the polaron with n + 1 spin flips and that with n spin flips for any n > 0. EZeeman can be realized via an in-plane magnetic field if these are spin skyrmions or via a sublattice potential if these are pseudospin skyrmions.
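The Zeeman argument above can be made concrete with a short sketch. Given zero-field energies E(n) for states with n spin flips (n = 0 being the bare electron), the shift (n + 1/2)EZeeman selects n = 1 whenever EZeeman lies between the largest value of [E(1) − E(n)]/(n − 1) over n ≥ 2 and E(0) − E(1). The energies used below are hypothetical numbers chosen to be convex in n; they are not results of our calculation, and the function names are illustrative.

```python
def zeeman_window(energies):
    """Return (lo, hi) such that n = 1 is the lowest state for lo < E_Z < hi.

    energies[n] is the zero-field energy of the state with n spin flips;
    n = 0 is the bare electron. An empty window (lo >= hi) means no Zeeman
    field singles out the single-spin-flip polaron.
    """
    e0, e1 = energies[0], energies[1]
    hi = e0 - e1  # above this field the bare electron wins
    lo = max(
        ((e1 - en) / (n - 1) for n, en in enumerate(energies) if n >= 2),
        default=0.0,
    )  # below this field some larger polaron wins
    return max(lo, 0.0), hi


def lowest_n(energies, e_z):
    """Number of spin flips of the lowest state at Zeeman energy e_z."""
    totals = [en + (n + 0.5) * e_z for n, en in enumerate(energies)]
    return min(range(len(totals)), key=totals.__getitem__)


# Hypothetical convex energies in arbitrary units: E(0), E(1), E(2), E(3).
energies = [0.0, -1.0, -1.5, -1.8]
lo, hi = zeeman_window(energies)
```

With these illustrative numbers, a field inside the window selects n = 1, a weaker field favors the largest polaron in the list, and a stronger field favors the bare electron.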

Second, our analysis focused on a single Chern sector. While we took into account the Hartree–Fock interaction-generated dispersion, which is mainly diagonal in the Chern index, we have neglected the part of the interaction and the dispersion connecting different Chern sectors. The inter-Chern part of the interaction vanishes in the chiral limit and is otherwise a relatively small correction that gives rise to an inter-Chern pseudospin coupling λ ≈ 0.4−0.6, which is antiferromagnetic in-plane and ferromagnetic out-of-plane12,20,38. This gives rise to a Zeeman term with EZeeman = λ which only affects pseudospin polarons. The inter-Chern part of the dispersion comes from the intrinsic dispersion of the BM model as well as the subtraction scheme employed when projecting out the remote bands to avoid double counting (see refs. 11, 12, 28, 47). It takes the form of tunneling between opposite Chern sectors, which influences the physics in several ways. First, the electron and polaron are no longer distinguished by their spin quantum number since the spin is no longer conserved in a given Chern sector, and thus they can tunnel into each other. Second, such tunneling perturbatively generates an antiferromagnetic “superexchange” coupling J ≈ 0.5−1 meV12,20,38 between the Chern sectors, which alters the energetics of magnons and plays a crucial role in the skyrmion pairing scenario20,21. We expect these terms to favor charge 2e polaron pairs (bipolarons) over single charge e polarons since they act as a Zeeman field EZeeman = J for spin or pseudospin polarons but do not affect the energy of polaron pairs. We leave a detailed analysis of the effect of these terms to future work. Finally, we have only focused on SU(2) polarons, which assumes that there are only two active bands while the remaining bands are frozen. This approach can be phenomenologically justified by the observation of flavor polarization ‘cascade’ features at relatively high temperatures48,49, suggesting that these polarized flavor degrees of freedom become frozen at low temperature. However, based on energetics alone, we cannot rule out more complicated SU(4) skyrmion/polaron textures.

Our findings suggest the following picture for charge excitations in TBG: (i) On doping a correlated insulator at integer ν ≠ 0 towards charge neutrality, charge enters the system as polarons or large skyrmions. This is consistent with the observed absence of quantum oscillations for this doping range1,2,50 and also explains the rapid loss of flavor (spin/valley) polarization (cascade transition)48,49 with doping. Combined with the observation that polarons are disfavored by reducing the screening length, this leads to the prediction that the cascade features should become weaker as the screening length to the gate is reduced4,51. (ii) On doping away from neutrality, charge likely enters the system as single-particle excitations, consistent with the observation of Landau fans. Finally, although we have focused on charge e excitations, let us make a few observations about pairing, i.e., the charge 2e excitations. Pairing between stable spin polarons, the analog of the skyrmion pairing mechanism proposed by the authors in ref. 20 (see also refs. 21, 52), can be naturally associated with doping towards neutrality, where superconductivity has been observed in some samples1,3,4. On the other hand, on doping away from neutrality (where superconductivity is seemingly more ubiquitous), spin polarons can remain relevant as finite-energy long-lived excitations whose pairing correlations can be induced in the electrons, even when they are not the lowest charge excitations. This suggests a BEC-to-BCS scenario with increasing dispersion where the bound state is lost while superconductivity persists53. A detailed theory of spin-polaron pairing will be the topic of future work.

In summary, we have identified a general tendency for the formation of a polaron-bound state between an electron and a spin flip in a Chern band that is purely topological in origin. We have studied the formation of such bound states over a wide range of parameters for the Chern bands of twisted bilayer graphene. This led us to identify the experimental parameter range where spin polarons are formed and to discuss their possible experimental consequences. Our results highlight the surprising fact that although the ground state is well approximated by a Slater determinant, a description in terms of electron-like single-particle excitations, whether approximate or exact, is insufficient to describe the charge physics in Chern bands. Furthermore, our analysis serves as a bridge between real space skyrmion textures and single-particle excitations.

Note

We would like to point out a parallel work54 which gives an extensive report of the energetics of TBG skyrmions, both charge “e” and charge “2e”, using variational Hartree–Fock. The results of that work, which is suited to studying the limit of large skyrmions, are complementary to our momentum space approach. In addition, after the appearance of our work on arXiv, ref. 55 appeared, which also studied spin polarons in the specific context of TBG and focused on the limit of short screening gate distance. In the chiral limit, their calculations agree with ours for the same parameters (after accounting for a difference in the notation for screening length, which is defined in ref. 55 as twice the distance between gate and sample). Away from the chiral limit, our work and ref. 55 adopt different approximations.

Methods

Explicit form of the Hamiltonian and total spin operators

The explicit form of the Hamiltonian \({{{{{{{{\mathcal{H}}}}}}}}}^{2e1h}\) can be easily obtained from the action of the Hamiltonian \({{{{{{{\mathcal{H}}}}}}}}\), Eq. (1), on \(\left|{{{{{{{{\bf{k}}}}}}}}}_{e1},\,{{{{{{{{\bf{k}}}}}}}}}_{e2},\,{{{{{{{{\bf{k}}}}}}}}}_{h}\right\rangle\), defined in Eq. (3), using the commutation relations

$$\left[\delta {\rho }_{{{{{{{{\bf{q}}}}}}}}},\,{c}_{\sigma,{{{{{{{\bf{k}}}}}}}}}^{{{{\dagger}}} }\right]={\lambda }_{-{{{{{{{\bf{q}}}}}}}}}^{*}({{{{{{{\bf{k}}}}}}}}){c}_{\sigma,{{{{{{{\bf{k}}}}}}}}-{{{{{{{\bf{q}}}}}}}}}^{{{{\dagger}}} },\quad \left[\delta {\rho }_{{{{{{{{\bf{q}}}}}}}}},\,{c}_{\sigma,{{{{{{{\bf{k}}}}}}}}}\right]=-{\lambda }_{{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}}){c}_{\sigma,{{{{{{{\bf{k}}}}}}}}+{{{{{{{\bf{q}}}}}}}}}$$
(9)

leading to

$${{{{{{{\mathcal{H}}}}}}}}\left|{{{{{{{{\bf{k}}}}}}}}}_{e1},\,{{{{{{{{\bf{k}}}}}}}}}_{e2},\,{{{{{{{{\bf{k}}}}}}}}}_{h}\right\rangle= \Big[{\epsilon }_{e}({{{{{{{{\bf{k}}}}}}}}}_{e1})+{\epsilon }_{e}({{{{{{{{\bf{k}}}}}}}}}_{e2})+{\epsilon }_{h}({{{{{{{{\bf{k}}}}}}}}}_{h})\Big]\left|{{{{{{{{\bf{k}}}}}}}}}_{e1},\,{{{{{{{{\bf{k}}}}}}}}}_{e2},\,{{{{{{{{\bf{k}}}}}}}}}_{h}\right\rangle \\ +\frac{1}{A}\mathop{\sum}\limits_{{{{{{{{\bf{q}}}}}}}}}{V}_{{{{{{{{\bf{q}}}}}}}}}\Big\{{\lambda }_{{{{{{{{\bf{q}}}}}}}}}^{*}({{{{{{{{\bf{k}}}}}}}}}_{e1}){\lambda }_{-{{{{{{{\bf{q}}}}}}}}}^{*}({{{{{{{{\bf{k}}}}}}}}}_{e2})\left|{{{{{{{{\bf{k}}}}}}}}}_{e1}+{{{{{{{\bf{q}}}}}}}},\,\,{{{{{{{{\bf{k}}}}}}}}}_{e2}-{{{{{{{\bf{q}}}}}}}},\,\,{{{{{{{{\bf{k}}}}}}}}}_{h}\right\rangle \\ -{\lambda }_{{{{{{{{\bf{q}}}}}}}}}^{*}({{{{{{{{\bf{k}}}}}}}}}_{e1}){\lambda }_{{{{{{{{\bf{q}}}}}}}}}({{{{{{{{\bf{k}}}}}}}}}_{h})\left|{{{{{{{{\bf{k}}}}}}}}}_{e1}+{{{{{{{\bf{q}}}}}}}},\,\,{{{{{{{{\bf{k}}}}}}}}}_{e2},\,{{{{{{{{\bf{k}}}}}}}}}_{h}+{{{{{{{\bf{q}}}}}}}}\right\rangle \\ \, -{\lambda }_{{{{{{{{\bf{q}}}}}}}}}^{*}({{{{{{{{\bf{k}}}}}}}}}_{e2}){\lambda }_{{{{{{{{\bf{q}}}}}}}}}({{{{{{{{\bf{k}}}}}}}}}_{h})\left|{{{{{{{{\bf{k}}}}}}}}}_{e1},\,{{{{{{{{\bf{k}}}}}}}}}_{e2}+{{{{{{{\bf{q}}}}}}}},\,\,{{{{{{{{\bf{k}}}}}}}}}_{h}+{{{{{{{\bf{q}}}}}}}}\right\rangle \Big\}$$
(10)

To express the total spin operator in the basis \(\left|\mathbf{k}_{e1},\,\mathbf{k}_{e2},\,\mathbf{k}_{h}\right\rangle\), we start from the standard expressions

$${S}^{2}={S}_{x}^{2}+{S}_{y}^{2}+{S}_{z}^{2},$$
(11)
$${S}_{x}=\frac{1}{2}\mathop{\sum}\limits_{{{{\bf{k}}}}}\left({c}_{{{{\bf{k}}}}, \uparrow }^{{\dagger}}{c}_{{{{\bf{k}}}}, \downarrow }+{c}_{{{{\bf{k}}}}, \downarrow }^{{\dagger}}{c}_{{{{\bf{k}}}}, \uparrow }\right),$$
(12)
$${S}_{y}=-\frac{i}{2}\mathop{\sum}\limits_{{{{\bf{k}}}}}\left({c}_{{{{\bf{k}}}}, \uparrow }^{{\dagger}}{c}_{{{{\bf{k}}}}, \downarrow }-{c}_{{{{\bf{k}}}}, \downarrow }^{{\dagger}}{c}_{{{{\bf{k}}}}, \uparrow }\right),$$
(13)
$${S}_{z}=\frac{1}{2}\mathop{\sum}\limits_{{{\bf{k}}}}\left({c}_{{{{\bf{k}}}}, \uparrow }^{{\dagger}}{c}_{{{{\bf{k}}}}, \uparrow }-{c}_{{{{\bf{k}}}}, \downarrow }^{{\dagger}}{c}_{{{{\bf{k}}}}, \downarrow }\right)$$
(14)

The state \(\left|\mathbf{k}_{e1},\,\mathbf{k}_{e2},\,\mathbf{k}_{h}\right\rangle=c_{\mathbf{k}_{e1},\uparrow}^{\dagger}c_{\mathbf{k}_{e2},\uparrow}^{\dagger}c_{\mathbf{k}_{h},\downarrow}\left|\downarrow\right\rangle\) is an Sz eigenstate with eigenvalue \(-\frac{N-3}{2}\). The action of Sx and Sy can be obtained using the commutation relations

$$\left[{S}_{x},\,{c}_{{{{\bf{k}}}}, \uparrow }^{{\dagger}}\right]=\frac{1}{2}{c}_{{{{\bf{k}}}}, \downarrow }^{{\dagger}}, \qquad \left[{c}_{{{{\bf{k}}}}, \downarrow }^{{\dagger}},\,{S}_{x}\right]=-\frac{1}{2}{c}_{{{{\bf{k}}}}, \uparrow }^{{\dagger}},$$
(15)
$$\left[{S}_{x},\,{c}_{{{{{{{{\bf{k}}}}}}}}, \downarrow }\right]=-\frac{1}{2}{c}_{{{{{{{{\bf{k}}}}}}}}, \uparrow },\qquad \left[{c}_{{{{{{{{\bf{k}}}}}}}}, \uparrow },\,{S}_{x}\right]=\frac{1}{2}{c}_{{{{{{{{\bf{k}}}}}}}}, \downarrow }$$
(16)
$$\left[{S}_{y},\,{c}_{{{{{{{{\bf{k}}}}}}}},\uparrow }^{{{{\dagger}}} }\right]=\frac{i}{2}{c}_{{{{{{{{\bf{k}}}}}}}},\downarrow }^{{{{\dagger}}} }, \qquad \left[{c}_{{{{{{{{\bf{k}}}}}}}}, \downarrow }^{{{{\dagger}}} },\,{S}_{y}\right]=\frac{i}{2}{c}_{{{{{{{{\bf{k}}}}}}}}, \uparrow }^{{{{\dagger}}} },$$
(17)
$$\left[{S}_{y},\,{c}_{{{{{{{{\bf{k}}}}}}}}, \downarrow }\right]=-\frac{i}{2}{c}_{{{{{{{{\bf{k}}}}}}}}, \uparrow }, \qquad \left[{c}_{{{{{{{{\bf{k}}}}}}}}, \uparrow },\,{S}_{y}\right]=-\frac{i}{2}{c}_{{{{{{{{\bf{k}}}}}}}}, \downarrow }$$
(18)

which leads after straightforward but tedious calculations to

$${S}_{x}^{2}\left|{{{{{{{{\bf{k}}}}}}}}}_{e1},\,{{{{{{{{\bf{k}}}}}}}}}_{e2},\,{{{{{{{{\bf{k}}}}}}}}}_{h}\right\rangle= \frac{1}{4}\Big[-3\left|{{{{{{{{\bf{k}}}}}}}}}_{e1},\,{{{{{{{{\bf{k}}}}}}}}}_{e2},\,{{{{{{{{\bf{k}}}}}}}}}_{h}\right\rangle \\ +2\mathop{\sum}\limits_{{{{{{{{\bf{k}}}}}}}}}\left(-{\delta }_{{{{{{{{{\bf{k}}}}}}}}}_{e1},\,{{{{{{{{\bf{k}}}}}}}}}_{h}}\left|{{{{{{{{\bf{k}}}}}}}}}_{e2},\, {{{{{{{\bf{k}}}}}}}},\, {{{{{{{\bf{k}}}}}}}}\right\rangle+{\delta }_{{{{{{{{{\bf{k}}}}}}}}}_{e2},\,{{{{{{{{\bf{k}}}}}}}}}_{h}}\left|{{{{{{{{\bf{k}}}}}}}}}_{e1},\, {{{{{{{\bf{k}}}}}}}},\, {{{{{{{\bf{k}}}}}}}}\right\rangle \right)\Big] \\ +{c}_{{{{{{{{{\bf{k}}}}}}}}}_{e1},\uparrow }^{{{{\dagger}}} }{c}_{{{{{{{{{\bf{k}}}}}}}}}_{e2},\uparrow }^{{{{\dagger}}} }{c}_{{{{{{{{{\bf{k}}}}}}}}}_{h},\downarrow\! }{S}_{x}^{2}\left|\downarrow \right\rangle$$
(19)
$${S}_{y}^{2}\left|{{{{{{{{\bf{k}}}}}}}}}_{e1},\,{{{{{{{{\bf{k}}}}}}}}}_{e2},\,{{{{{{{{\bf{k}}}}}}}}}_{h}\right\rangle= -\frac{1}{4}\Big[3\left|{{{{{{{{\bf{k}}}}}}}}}_{e1},\,{{{{{{{{\bf{k}}}}}}}}}_{e2},\,{{{{{{{{\bf{k}}}}}}}}}_{h}\right\rangle \\ +2\mathop{\sum}\limits_{{{{{{{{\bf{k}}}}}}}}}\left({\delta }_{{{{{{{{{\bf{k}}}}}}}}}_{e1},\,{{{{{{{{\bf{k}}}}}}}}}_{h}}\left|{{{{{{{{\bf{k}}}}}}}}}_{e2},\, {{{{{{{\bf{k}}}}}}}},\, {{{{{{{\bf{k}}}}}}}}\right\rangle -{\delta }_{{{{{{{{{\bf{k}}}}}}}}}_{e2},\,{{{{{{{{\bf{k}}}}}}}}}_{h}}\left|{{{{{{{{\bf{k}}}}}}}}}_{e1},\, {{{{{{{\bf{k}}}}}}}},\, {{{{{{{\bf{k}}}}}}}}\right\rangle \right)\Big]\\ +{c}_{{{{{{{{{\bf{k}}}}}}}}}_{e1},\uparrow }^{{{{\dagger}}} }{c}_{{{{{{{{{\bf{k}}}}}}}}}_{e2},\uparrow }^{{{{\dagger}}} }{c}_{{{{{{{{{\bf{k}}}}}}}}}_{h},\downarrow\! }{S}_{y}^{2}\left|\! \downarrow \right\rangle$$
(20)

Combining these with the action of \(S_{z}^{2}\) leads to

$${S}^{2}\left|{{{{{{{{\bf{k}}}}}}}}}_{e1},\,{{{{{{{{\bf{k}}}}}}}}}_{e2},\,{{{{{{{{\bf{k}}}}}}}}}_{h}\right\rangle= \left(\frac{N-1}{2}\right)\left(\frac{N-3}{2}\right)\left|{{{{{{{{\bf{k}}}}}}}}}_{e1},\,{{{{{{{{\bf{k}}}}}}}}}_{e2},\,{{{{{{{{\bf{k}}}}}}}}}_{h}\right\rangle \\ +\hat{M}\left|{{{{{{{{\bf{k}}}}}}}}}_{e1},\,{{{{{{{{\bf{k}}}}}}}}}_{e2},\,{{{{{{{{\bf{k}}}}}}}}}_{h}\right\rangle,$$
(21)
$$\hat{M}\left|{{{{{{{{\bf{k}}}}}}}}}_{e1},\,{{{{{{{{\bf{k}}}}}}}}}_{e2},\,{{{{{{{{\bf{k}}}}}}}}}_{h}\right\rangle=\mathop{\sum}\limits_{{{{{{{{\bf{k}}}}}}}}}\left(-{\delta }_{{{{{{{{{\bf{k}}}}}}}}}_{e1},\,{{{{{{{{\bf{k}}}}}}}}}_{h}}\left|{{{{{{{{\bf{k}}}}}}}}}_{e2},\,{{{{{{{\bf{k}}}}}}}},\,{{{{{{{\bf{k}}}}}}}}\right\rangle+{\delta }_{{{{{{{{{\bf{k}}}}}}}}}_{e2},\,{{{{{{{{\bf{k}}}}}}}}}_{h}}\left|{{{{{{{{\bf{k}}}}}}}}}_{e1},\,{{{{{{{\bf{k}}}}}}}},\,{{{{{{{\bf{k}}}}}}}}\right\rangle \right)$$
(22)

It is easy to verify that the operator \(\hat{M}\) defined in Eq. (22) satisfies \(\hat{M}^{2}=(N-1)\hat{M}\), hence its eigenvalues are 0 and N − 1. The former yields the \(S^{2}\) eigenvalue \(\left(\frac{N-1}{2}\right)\left(\frac{N-3}{2}\right)\), which corresponds to a total spin \(S=\frac{N-3}{2}\), whereas the latter yields the \(S^{2}\) eigenvalue \(\left(\frac{N-1}{2}\right)\left(\frac{N+1}{2}\right)\), which corresponds to a total spin \(S=\frac{N-1}{2}\).
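The identity \(\hat{M}^{2}=(N-1)\hat{M}\) can also be checked numerically for small N. The sketch below builds the matrix of \(\hat{M}\) from Eq. (22) in the antisymmetrized basis \(\left|\mathbf{k}_{e1},\mathbf{k}_{e2},\mathbf{k}_{h}\right\rangle\) with \(\mathbf{k}_{e1}<\mathbf{k}_{e2}\). Momenta are replaced by integer labels 0, …, N − 1 and momentum conservation is ignored; the function name `build_M` and this labeling are assumptions of the illustration, not part of the derivation.

```python
import itertools

import numpy as np


def build_M(N):
    """Matrix of the operator M of Eq. (22) for N single-particle momenta.

    Basis states |k1, k2, kh> are labeled by integers with k1 < k2; the state
    is antisymmetric under exchange of k1 and k2 and vanishes for k1 == k2.
    """
    pairs = list(itertools.combinations(range(N), 2))
    basis = [(k1, k2, kh) for (k1, k2) in pairs for kh in range(N)]
    index = {state: i for i, state in enumerate(basis)}
    M = np.zeros((len(basis), len(basis)))

    def add(col, a, b, h, amp):
        # Bring |a, b, h> to the canonical ordering a < b, tracking the sign.
        if a == b:
            return  # Pauli exclusion: the state vanishes
        if a > b:
            a, b, amp = b, a, -amp
        M[index[(a, b, h)], col] += amp

    for col, (k1, k2, kh) in enumerate(basis):
        for k in range(N):
            if k1 == kh:
                add(col, k2, k, k, -1.0)
            if k2 == kh:
                add(col, k1, k, k, +1.0)
    return M


N = 4
M = build_M(N)
eigenvalues = np.linalg.eigvalsh(M)  # M is real symmetric in this basis
```

In this toy setting the nonzero eigenvalue N − 1 appears N times, once for each momentum label of the unpaired electron.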

Properties of the electron-magnon basis

In this section, we discuss the properties of the electron-magnon basis introduced in the main text, repeated here for completeness

$$\left|{{{{{{{{\bf{k}}}}}}}}}_{0};{{{{{{{\bf{q}}}}}}}},\,n\right\rangle={c}_{{{{{{{{{\bf{k}}}}}}}}}_{0}+{{{{{{{\bf{q}}}}}}}},\,\uparrow }^{{{{\dagger}}} }{a}_{n,{{{{{{{\bf{q}}}}}}}}}^{{{{\dagger}}} }\left|\downarrow \right\rangle$$
(23)

First, note that the state \(\left|\mathbf{k}_{0};0,\,0\right\rangle\) is a single-particle state with \(S=\frac{N-1}{2}\), since \(a_{0,\mathbf{q}=0}^{\dagger}\) is simply the generator of a uniform spin rotation which increases Sz of the ground state by 1. We note that the basis (23) contains \(N^{2}\) states for a given k0, which is more than the size of the Hilbert space given by \(\frac{N(N-1)}{2}\). The reason is that this basis is overcomplete and not orthonormal. Instead, the overlap of states is given by

$${g}_{{{{{{{{\bf{q}}}}}}}},\,{{{{{{{\bf{q}}}}}}}}^{\prime} }^{nm}\left({{{{{{{{\bf{k}}}}}}}}}_{0}\right) =\langle {{{{{{{{\bf{k}}}}}}}}}_{0};{{{{{{{\bf{q}}}}}}}},\,n|{{{{{{{{\bf{k}}}}}}}}}_{0};{{{{{{{\bf{q}}}}}}}}^{\prime},\, m\rangle \\ ={\delta }_{mn}{\delta }_{{{{{{{{\bf{q}}}}}}}},\,{{{{{{{\bf{q}}}}}}}}^{\prime} }-{\phi }_{m{{{{{{{\bf{q}}}}}}}}^{\prime} }^{*}\left({{{{{{{{\bf{k}}}}}}}}}_{0}+{{{{{{{\bf{q}}}}}}}}\right){\phi }_{n{{{{{{{\bf{q}}}}}}}}} \left({{{{{{{{\bf{k}}}}}}}}}_{0}+{{{{{{{\bf{q}}}}}}}}^{\prime} \right)$$
(24)

This overlap can be identified with the matrix elements of the operator \(1-\hat{F}\), where \(\hat{F}\) is the operator that exchanges two electrons. The \(N^{2}\) basis states for a given k0 include \(\frac{N(N-1)}{2}\) fermionic (antisymmetric) states with g(k0) eigenvalue 2 and \(\frac{N(N+1)}{2}\) bosonic (symmetric) states with g(k0) eigenvalue 0. Since the exchange operator \(\hat{F}\) commutes with the Hamiltonian, we can obtain the physical Hilbert space (3) simply by restricting to the eigenstates of the Hamiltonian with g(k0) eigenvalue 2.
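The counting of fermionic and bosonic states can be illustrated with a toy model in which \(\hat{F}\) is represented as a plain swap of two labels, each taking N values. This is an assumption made for illustration: the actual overlap matrix of Eq. (24) depends on the magnon wavefunctions, but its spectrum is expected to follow the same counting. The function name is hypothetical.

```python
import numpy as np


def exchange_overlap_spectrum(N):
    """Count eigenvalues 2 and 0 of g = 1 - F for a swap operator F.

    F acts on pairs of labels (i, j) with i, j in 0..N-1 as F|i, j> = |j, i>.
    Returns (number of eigenvalues 2, number of eigenvalues 0).
    """
    dim = N * N
    F = np.zeros((dim, dim))
    for i in range(N):
        for j in range(N):
            F[i * N + j, j * N + i] = 1.0
    g = np.eye(dim) - F
    evals = np.linalg.eigvalsh(g)  # g is real symmetric
    n_two = int(np.sum(np.isclose(evals, 2.0)))
    n_zero = int(np.sum(np.isclose(evals, 0.0)))
    return n_two, n_zero


n_fermionic, n_bosonic = exchange_overlap_spectrum(5)
```

For N = 5 the two counts are N(N − 1)/2 = 10 and N(N + 1)/2 = 15, which add up to the \(N^{2}=25\) states of the overcomplete basis.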

Derivation of the topological electron-magnon coupling at small momenta

Our purpose in this section is to derive the form of the electron-magnon coupling at small momenta, Eq. (7). The magnon creation operator is defined in Eq. (4), repeated here for completeness

$${a}_{n,{{{{{{{\bf{q}}}}}}}}}^{{{{\dagger}}} }=\mathop{\sum}\limits_{{{{{{{{\bf{k}}}}}}}}}{c}_{\uparrow,{{{{{{{\bf{k}}}}}}}}}^{{{{\dagger}}} }{c}_{\downarrow,{{{{{{{\bf{k}}}}}}}}+{{{{{{{\bf{q}}}}}}}}}{\phi }_{n,{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}})$$
(25)

where ϕn,q(k) is the complete orthonormal set of eigenfunctions of the soft mode Hamiltonian defined as

$${{{{{{{{\mathcal{H}}}}}}}}}_{{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}}^{\prime},{{{{{{{\bf{k}}}}}}}})=\left\langle {c}_{\downarrow,{{{{{{{\bf{k}}}}}}}}^{\prime}+{{{{{{{\bf{q}}}}}}}}}^{{{{\dagger}}} }{c}_{\uparrow,{{{{{{{\bf{k}}}}}}}}^{\prime} }{{{{{{{{\mathcal{H}}}}}}}}}_{V}{c}_{\uparrow,{{{{{{{\bf{k}}}}}}}}}^{{{{\dagger}}} }{c}_{\downarrow,{{{{{{{\bf{k}}}}}}}}+{{{{{{{\bf{q}}}}}}}}}\right\rangle,$$
(26)
$$\mathop{\sum}\limits_{{{{{{{{\bf{k}}}}}}}}^{\prime} }{{{{{{{{\mathcal{H}}}}}}}}}_{{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}},{{{{{{{\bf{k}}}}}}}}^{\prime} ){\phi }_{n,{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}}^{\prime} )={\xi }_{n,{{{{{{{\bf{q}}}}}}}}}{\phi }_{n,{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}})$$
(27)

We notice that gauge invariance requires that ϕn,q(k) transforms the same way as λq(k) under gauge transformations. That is, under \({c}_{{{{{{{{\bf{k}}}}}}}},\sigma }\,\mapsto\, {c}_{{{{{{{{\bf{k}}}}}}}},\sigma }{e}^{i{\theta }_{k}}\), \({\phi }_{n,{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}})\,\mapsto\, {e}^{-i[{\theta }_{{{{{{{{\bf{k}}}}}}}}+{{{{{{{\bf{q}}}}}}}}}-{\theta }_{{{{{{{{\bf{k}}}}}}}}}]}{\phi }_{n,{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}})\). This means we can define a gauge invariant \({\tilde{\phi }}_{n,{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}})\) via

$${\phi }_{n,{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}})={\tilde{\lambda }}_{{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}}){\tilde{\phi }}_{n,{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}}),\qquad {\tilde{\lambda }}_{{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}})=\frac{{\lambda }_{{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}})}{|{\lambda }_{{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}})|}$$
(28)

where we used the phase of the form factor \({\tilde{\lambda }}_{{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}})\) rather than its full value to maintain the normalization of the wavefunctions. It is easy to show that in the limit q → 0, \({\phi }_{0,{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}})\to \frac{1}{\sqrt{N}}\). This is nothing but the statement that the Goldstone mode in the limit of long wavelength reduces to the spin raising operator. Thus, we can write

$${\tilde{\phi }}_{0,{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}})\,\approx\, \frac{1}{\sqrt{N}}[1+i{{{{{{{\bf{q}}}}}}}}\cdot {{{{{{{\bf{v}}}}}}}}({{{{{{{\bf{k}}}}}}}})+O({{{{{{{{\bf{q}}}}}}}}}^{2})]$$
(29)

Crucially, we can show that the transformed Hamiltonian \(\tilde{\mathcal{H}}_{\mathbf{q}}(\mathbf{k},\,\mathbf{k}^{\prime})=\tilde{\lambda}_{\mathbf{q}}^{*}(\mathbf{k})\mathcal{H}_{\mathbf{q}}(\mathbf{k},\,\mathbf{k}^{\prime})\tilde{\lambda}_{\mathbf{q}}(\mathbf{k}^{\prime})\), whose eigenfunctions are \(\tilde{\phi}_{n,\mathbf{q}}(\mathbf{k})\), is periodic and smooth in k, and so are \(\tilde{\phi}_{n,\mathbf{q}}(\mathbf{k})\) and v(k). This can be seen by noting that the transformed Hamiltonian only depends on gauge-invariant combinations of \(\tilde{\lambda}_{\mathbf{q}}(\mathbf{k})\), which can be written in terms of the Berry curvature, which is periodic and smooth in k. Substituting Eqs. (28) and (29) in Eq. (6) in the main text and using the small q expansion of the form factor \(\lambda_{\mathbf{q}}(\mathbf{k})\approx 1+i\mathbf{q}\cdot\mathbf{A}(\mathbf{k})+O(\mathbf{q}^{2})\) yields

$${C}_{{{{{{{{\bf{q}}}}}}}},\,{{{{{{{\bf{q}}}}}}}}^{\prime} }^{00}= \frac{i}{N}\mathop{\sum}\limits_{{{{{{{{\bf{k}}}}}}}}}\Big\{{q}_{\mu }^{\prime}[{A}^{\mu }({{{{{{{\bf{k}}}}}}}})-{A}^{\mu }({{{{{{{\bf{k}}}}}}}}+{{{{{{{\bf{q}}}}}}}})]\\ +\,{q}_{\mu }[{A}^{\mu }({{{{{{{\bf{k}}}}}}}}+{{{{{{{\bf{q}}}}}}}}^{\prime} )-{A}^{\mu }({{{{{{{\bf{k}}}}}}}})]+{q}_{\mu }[{v}^{\mu }({{{{{{{\bf{k}}}}}}}} +{{{{{{{\bf{q}}}}}}}}^{\prime} )-{v}^{\mu }({{{{{{{\bf{k}}}}}}}})]\Big\}\\= \, i{{{{{{{{\bf{q}}}}}}}}}_{\mu }{{{{{{{{\bf{q}}}}}}}}}_{\nu }^{\prime}\int \frac{{d}^{2}{{{{{{{\bf{k}}}}}}}}}{{A}_{{{{{{{{\rm{BZ}}}}}}}}}}[{\partial }_{\mu }{A}_{\nu }({{{{{{{\bf{k}}}}}}}})-{\partial }_{\nu }{A}_{\mu }({{{{{{{\bf{k}}}}}}}})]\\= \, i{{{{{{{\bf{q}}}}}}}}\wedge {{{{{{{\bf{q}}}}}}}}^{\prime} \int \frac{{d}^{2}{{{{{{{\bf{k}}}}}}}}}{{A}_{{{{{{{{\rm{BZ}}}}}}}}}}{{\Omega }}({{{{{{{\bf{k}}}}}}}})=i\frac{2\pi C}{{A}_{{{{{{{{\rm{BZ}}}}}}}}}}{{{{{{{\bf{q}}}}}}}}\wedge {{{{{{{\bf{q}}}}}}}}^{\prime}$$
(30)

On going from the first to the second line, we used the periodicity of v(k) to shift the momentum summation, leading to \(\sum_{\mathbf{k}}v^{\mu}(\mathbf{k}+\mathbf{q}^{\prime})-\sum_{\mathbf{k}}v^{\mu}(\mathbf{k})=0\) (notice that this does not work for Aμ(k), which cannot be periodic in a band with finite Chern number). In the last equality, we used the definition of the Chern number \(\int d^{2}\mathbf{k}\,\Omega(\mathbf{k})=2\pi C\).
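For readers who wish to see the quantization of the Berry flux at work, the following sketch evaluates the Chern number on a discretized Brillouin zone using gauge-invariant plaquette phases (a standard lattice method). The two-band model used here, with d(k) = (sin kx, sin ky, m + cos kx + cos ky), is a generic textbook Chern insulator chosen purely for illustration; it is not the TBG model studied in this work, and the overall sign of C depends on conventions.

```python
import numpy as np


def chern_number(m, n=24):
    """Chern number of the lower band of H(k) = d(k) . sigma on an n x n grid.

    d(k) = (sin kx, sin ky, m + cos kx + cos ky) is an illustrative model.
    The Berry flux through each plaquette is the phase of the product of
    four link overlaps, which is independent of the eigenvector gauge.
    """
    k = 2.0 * np.pi * np.arange(n) / n
    kx, ky = np.meshgrid(k, k, indexing="ij")
    d1, d2, d3 = np.sin(kx), np.sin(ky), m + np.cos(kx) + np.cos(ky)

    H = np.empty((n, n, 2, 2), dtype=complex)
    H[..., 0, 0] = d3
    H[..., 1, 1] = -d3
    H[..., 0, 1] = d1 - 1j * d2
    H[..., 1, 0] = d1 + 1j * d2

    _, vecs = np.linalg.eigh(H)  # eigenvalues in ascending order
    u = vecs[..., :, 0]          # lower-band eigenvector at each k

    def link(a, b):
        # Overlap <a|b> evaluated at every grid point.
        return np.sum(np.conj(a) * b, axis=-1)

    ux = link(u, np.roll(u, -1, axis=0))  # link from k to k + dkx
    uy = link(u, np.roll(u, -1, axis=1))  # link from k to k + dky
    plaquette = (
        ux
        * np.roll(uy, -1, axis=0)
        * np.conj(np.roll(ux, -1, axis=1))
        * np.conj(uy)
    )
    flux = np.angle(plaquette)  # Berry flux through each plaquette
    return flux.sum() / (2.0 * np.pi)


c_topological = chern_number(1.0)  # gapped phase with |C| = 1
c_trivial = chern_number(3.0)      # gapped phase with C = 0
```

The summed flux divided by 2π is an integer up to rounding, in line with the definition of C used in the last equality of Eq. (30).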

Dispersion

Here, we discuss some details about the dispersion we used in the main text. At integer fillings and if we ignore the inter-Chern part of the dispersion (the decoupled Chern sector approximation), the single particle dispersion is given29,30 by diagonalizing the Hartree–Fock Hamiltonian12,28 which takes the form

$${H}_{{{{{{{{\rm{HF}}}}}}}}}[Q]({{{{{{{\bf{k}}}}}}}})= \frac{1}{2A}\mathop{\sum}\limits_{{{{{{{{\bf{G}}}}}}}}}{V}_{{{{{{{{\bf{G}}}}}}}}}{{{\Lambda }}}_{{{{{{{{\bf{G}}}}}}}}}({{{{{{{\bf{k}}}}}}}})\mathop{\sum}\limits_{{{{{{{{\bf{k}}}}}}}}^{\prime} }{{{{{{{\rm{tr}}}}}}}}{{{\Lambda }}}_{-{{{{{{{\bf{G}}}}}}}}}({{{{{{{\bf{k}}}}}}}}^{\prime} ){Q}_{{{{{{{{\bf{k}}}}}}}}}\\ -\frac{1}{2A}\mathop{\sum}\limits_{{{{{{{{\bf{q}}}}}}}}}{V}_{{{{{{{{\bf{q}}}}}}}}}{{{\Lambda }}}_{{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}}){Q}_{{{{{{{{\bf{k}}}}}}}}+{{{{{{{\bf{q}}}}}}}}}{{{\Lambda }}}_{{{{{{{{\bf{q}}}}}}}}}{({{{{{{{\bf{k}}}}}}}})}^{{{{\dagger}}} }$$
(31)

Here, Qk is a matrix with eigenvalues ±1 describing a Slater determinant state such that ±1 correspond to full/empty electronic states. Λq(k) is a matrix of form factors with spin (s), sublattice (σ), and valley (τ) indices which can be transformed into a Chern (γ), spin (s), and pseudospin (η) basis (see refs. 12, 20, 38). Since our analysis focuses on a single Chern sector, we are going to neglect the Chern off-diagonal terms in the form factor Λ, which were shown in ref. 12 to be relatively small (they vanish identically in the chiral limit). In this limit, Λq(k) takes the simple form \(\Lambda_{\mathbf{q}}(\mathbf{k})=s_{0}\otimes\eta_{0}\otimes\mathrm{diag}(\lambda_{\mathbf{q}}(\mathbf{k}),\,\lambda_{\mathbf{q}}^{*}(\mathbf{k}))_{\gamma}\), where λq(k) are the form factors for a single Chern band. At integer filling ν and ignoring inter-Chern dispersion, the family of ground states is described by a Qk that can be chosen to be k-independent and to satisfy \(\mathrm{tr}\,Q=2\nu\) and [Q, Λq(k)] = 0, which is equivalent to the condition that Q is Chern-diagonal, i.e., [Q, γz] = 012. Under these conditions, the Hamiltonian simplifies to

$${H}_{{{{{{{{\rm{HF}}}}}}}}}[Q]({{{{{{{\bf{k}}}}}}}})={\epsilon }_{\nu,\pm }({{{{{{{\bf{k}}}}}}}})=\pm {\epsilon }_{F}({{{{{{{\bf{k}}}}}}}})+\nu {\epsilon }_{H}({{{{{{{\bf{k}}}}}}}}),$$
(32)
$${\epsilon }_{H}({{{{{{{\bf{k}}}}}}}})=\frac{1}{A}\mathop{\sum}\limits_{{{{{{{{\bf{G}}}}}}}}}{V}_{{{{{{{{\bf{G}}}}}}}}}{\lambda }_{{{{{{{{\bf{G}}}}}}}}}({{{{{{{\bf{k}}}}}}}})\mathop{\sum}\limits_{{{{{{{{\bf{k}}}}}}}}^{\prime} }{\lambda }_{-{{{{{{{\bf{G}}}}}}}}}({{{{{{{\bf{k}}}}}}}}^{\prime} ),$$
(33)
$${\epsilon }_{F}({{{{{{{\bf{k}}}}}}}})=\frac{1}{2A}\mathop{\sum}\limits_{{{{{{{{\bf{q}}}}}}}}}{V}_{{{{{{{{\bf{q}}}}}}}}}|{\lambda }_{{{{{{{{\bf{q}}}}}}}}}({{{{{{{\bf{k}}}}}}}}){|}^{2}$$
(34)

where the positive (negative) sign is for the electron (hole) bands. We note that \(\epsilon_{\nu,\pm}(\mathbf{k})=-\epsilon_{-\nu,\mp}(\mathbf{k})\) due to particle-hole symmetry, so that electron (hole) bands on the ν > 0 side map to hole (electron) bands on the ν < 0 side. That is, doping away from charge neutrality is the same for positive and negative ν, and similarly for doping towards neutrality. The Hartree and Fock potentials are plotted in Fig. 4a, and we can see that both are characterized by a dip at Γ. Thus, for doping away from neutrality, the two are going to add, while on doping towards neutrality, they subtract. In the main text, we used ν as an interpolation parameter that also takes non-integer values as a proxy for tuning the bandwidth. Fig. 4b shows the bandwidth as a function of ν, and we see that there is a minimum in the range ν ∈ [−1.5, −1] depending on the chiral ratio κ. The value of \(\nu=\nu_{\min}\) for which the bandwidth is minimum is shown in Fig. 4c. We note that for \(\nu>\nu_{\min}\), the band minimum is at Γ, whereas for \(\nu<\nu_{\min}\), the band maximum is at Γ.

Fig. 4: Details of the approximate Hartree–Fock dispersion.

a Hartree ϵH(k) and Fock ϵF(k) dispersion. b Bandwidth for the dispersion ϵν(k) as a function of ν for different values of κ. c Value of ν for which the bandwidth is minimum as a function of κ.
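The interplay between the Hartree and Fock terms can be illustrated with a minimal sketch in which both are modeled as Gaussian dips at Γ of slightly different widths. The profiles, amplitudes, widths, and momentum grid below are assumptions chosen for illustration and are not the TBG potentials of Fig. 4a; the point is only that the bandwidth of νϵH(k) + ϵF(k) is smallest at a negative value of ν, where the two dips partially cancel.

```python
import numpy as np


def bandwidth(nu, eps_H, eps_F):
    """Bandwidth of the dispersion nu * eps_H + eps_F sampled on a grid."""
    eps = nu * eps_H + eps_F
    return eps.max() - eps.min()


# Toy momentum grid centered on Gamma (arbitrary units).
k = np.linspace(-1.0, 1.0, 41)
kx, ky = np.meshgrid(k, k, indexing="ij")
k2 = kx**2 + ky**2

# Hypothetical Hartree and Fock profiles: both have a dip at Gamma.
eps_H = -1.0 * np.exp(-k2 / 0.30)
eps_F = -1.2 * np.exp(-k2 / 0.45)

# Scan the interpolation parameter nu and locate the minimum bandwidth.
nus = np.linspace(-4.0, 4.0, 161)
widths = np.array([bandwidth(nu, eps_H, eps_F) for nu in nus])
nu_min = nus[np.argmin(widths)]
```

In this toy model the terms add for ν > 0 and subtract for ν < 0, so the minimum of `widths` falls at negative ν, qualitatively mirroring the behavior in Fig. 4b.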