Introduction

A thorough understanding of protein aggregation behaviour in solution is crucial for the controlled manufacture and formulation of therapeutics and for the development of crystallization-based purification strategies for biological drugs1,2,3,4. Unfortunately, while the state diagram of protein-solvent systems can provide indications about the phase behaviour of biomacromolecules and their stability in solution, the intimate mechanism underlying protein nucleation and crystal growth have not been fully understood yet and, in many cases, it remains elusive. Protein aggregation is directly related to protein-protein interactions and may be strongly related to protein crystallization behaviour5. Since different types of intermolecular forces are involved in protein interactions, such as electrostatic, van der Waals and hydrophobic6, environmental factors like pH, ionic strength and the presence of additives, can drastically alter the intermolecular interaction profile of proteins in solution7,8.

Colloidal and conformational stabilities are the two main factors that govern the existence of a protein aggregate in a solution9,10. Colloidal stability is a delicate balance between repulsive and attractive forces between proteins, while conformational stability is related to the free energy difference between two molecular states, folded and partially or totally unfolded. Several investigations have found that the melting temperature (Tm) and the osmotic second virial coefficient (B22) can respectively account for the protein conformational11 and colloidal12 stability. Both Tm and B22 can be experimentally determined by using correspondingly Dynamic Light Scattering (DLS) and Static Light Scattering (SLS) methods13. Aggregation induced by thermal unfolding can be monitored by measuring the average hydrodynamic diameter value as a function of temperature, while B22 describes the first deviation from ideal behaviour of dilute colloidal solutions14. A positive or negative value of B22 accounts for a net repulsive or attractive interaction balance, respectively, and could therefore be used to predict stable, crystallizing or precipitating protein solution conditions. According to the experimental crystallization slot of George and Wilson15, a protein solution would be: stable if B22 > −0.8 × 10−4 mL mol g−2; crystallizing if −8.4 × 10−4 < B22 < −0.8 × 10−4 mL mol g−2; precipitating if B22 < −8.4 × 10−4 mL mol g−2. Haas and Drenth16 postulated a modified crystallization slot, ranging from −0.9 × 10−4 to −0.35 × 10−4 mol mL g−2 for protein with a molecular size as high as 140 kDa.

The Z-potential is another indicator of colloidal stability, which measures the magnitude of the electrostatic or charge repulsion/attraction between particles17. It is related to the surface charge of the molecules, the adsorbed layer at the interface and the nature and composition of the surrounding environment. A large Z-potential value (|Z-potential| >30 mV) can be considered an indicator of colloidal stability. Recently, the use of the interaction parameter kd has been also suggested as a more high-throughput means to quantify protein-protein interactions18,19.

In this work, we focus on the relationships between the crystallization propensity of Anti-CD20 in the Na2SO4/PEG400 solvent system of different compositions and its relation with some measurable parameters normally used to assess protein conformational and colloidal stability in solution, with the aim to understand the aggregation mechanism of this complex biomacromolecule. Anti-CD20 is a full-length monoclonal antibody (mAb) and it is a therapeutic protein extensively used for treatment of chronic lymphocytic leukaemia and non-Hodgkin’s lymphoma, which is commercialized with the brand name of Rituxan or Mabthera. Here, the crystallization of this biomolecule has been observed in a PEG400/Na2SO4/Water system near Liquid-Liquid Phase Separation (LLPS) conditions by both vapour diffusion and batch methods. LLPS is a very common phenomena for monoclonal antibodies20 and is of great interest because the factors which induce the separation of protein solutions into coexisting protein-poor and protein-rich phases are observed to play a central role in protein interaction and crystallization21. For several protein/PEG or protein/salt systems both the LLPS and the aggregation propensity of proteins in solution have been shown to coincide with net attractive interactions between molecules22,23,24. Monte Carlo simulations performed on systems that undergo phase separation demonstrated that the interactions between the proteins fall into the short-range regime and are strongly anisotropic25. Over time, the anisotropic interactions may be responsible for the evolution of the system from the phase separation to reversible aggregation conditions. In this situation, the LLPS has been speculated to behave as a metastable state towards nucleation26,27,28. In fact, the pathway for the formation of crystal nuclei drastically changes near the metastable low density–high density liquid region coexistence, with the free-energy barrier for crystal nucleation being strongly reduced. In this scenario, the crystal nucleation rate increases by many orders of magnitude over that predicted from classical nucleation theory28, thus generating a fast and diffuse crystallization that often results in poor-quality crystals.

The results of this investigation allowed us to gain further insights into the mechanism governing the nucleation-crystallization of a complex system such as the Anti-CD20 mAb.

Methods

Anti-CD20 crystallization experiments

Anti-CD20 monoclonal antibody (MW = 144.5 kDa) was kindly provided by FUJIFILM Diosynth Biotechnologies (Billingham, UK) at high level of purity (>98% HPLC, >95% SDS-PAGE) in 0.025 M sodium citrate buffer at pH 6.5 and 0.154 M sodium chloride solution. Sodium sulphate anhydrous (purity ≥ 99.99%), PEG400, HEPES (purity ≥ 99.5%), TRIS (purity ≥ 99.5%), magnesium chloride hexahydrate (≥99%) and NaOH (purity ≥ 98%), from Sigma-Aldrich (Italy), were used without further purification. MilliQ water was used to prepare all solutions. Anti-CD20 was thawed for 2–3 hours in an 8 °C water bath before buffer exchange with HEPES 0.1 M and concentration by Amicon Ultra centrifugal filter tube (Ultra-4, cut-off 10 kDa, from Merck) at 6,000 g and 8 °C. Finally, the protein solution was diluted to the desired concentration by adding HEPES buffer 0.1 M. NaOH 1 M was used to adjust the pH to the desired value. The Anti-CD20 concentration was determined by measuring protein absorbance at 280 nm by UV-Vis spectrophotometer (extinction coefficient at 280 nm 237,380 M−1 cm−1).

Protocols suitable for crystallizing Anti-CD20 reproducibly by vapour diffusion and batch methods were developed in previous works29,30. Precipitant solutions were made by dissolving sodium sulphate in the HEPES buffer solution containing PEG400 at the proper concentration. All solutions were filtered through a 0.22 µm Anotop 10 Filter Unit (Whatman). In sitting-drop vapour diffusion crystallization experiments, drops made of 10 μL of precipitant solution and 10 μL of protein solutions were equilibrated in a sealed well against 1 mL of a hypertonic solution of MgCl2·6H2O 30 wt.% used as reservoir. For the batch crystallization experiments, a volume of protein solutions having a certain initial concentration was mixed with the required volume of precipitant solution to obtain a supersaturated solution of known composition (final volume 1 mL). All crystallization experiments were performed in a vibration-free refrigerator at 20 °C.

Analysis of Anti-CD20 crystals

Crystals obtained from crystallization experiments were observed using an optical microscope (Eclipse LV 100ND, Nikon Instruments, Italy) equipped with a video camera, separated from the mother liquor by using a centrifuge (MiniSpin Plus Eppendorf model IVD) and centrifuge tubes (0.22 µm pore size) and extensively washed. Washing solution was made by 0.1 M HEPES, Na2SO4 and PEG400 whose concentration was optimized to avoid crystal solubilisation according to standard procedures. Washed crystals were dissolved by using 0.01 M TRIS-HCl, NaCl 0.15 M, pH 8.0 and analysed by UV spectrophotometry (Lambda EZ201, Perkin Elmer) to estimate protein concentration.

Circular dichroism (CD) measurements

The far-UV CD spectra of Anti-CD20 under initial conditions, after buffer exchange with HEPES 0.1 M, and after mixing with the precipitant solution, were recorded between 200 and 300 nm on a Chirascan CD spectrometer (Applied Photophysics, UK) at 20 °C with a 1 nm bandwidth resolution and current time-per-point of 3 s. 100 μL of mAbs solution were added in a 0.1 mm path-length quartz cuvette that was placed in the spectrophotometer. Typically, three scans were recorded and baseline spectra were subtracted from each spectrum. Data were processed using Applied Photophysics Chirascan Viewer.

Melting point and Z-potential assessment by DLS

Thermal stability (melting temperature, Tm) of protein solutions was assessed by a Zetasizer Nano ZS instrument (Malvern) equipped with a 4 mW He–Ne laser at 632.8 nm and a detector placed at 173° in agreement to the proprietary NIBS (Non Invasive Back Scatter) technology. Samples were placed in quartz cuvettes (1 cm path-length), heated by a Peltier module and the average hydrodynamic diameter was measured as a function of temperature from 20 °C to 90 °C (temperature increment 0.5 °C). DLS can monitor the thermal denaturation of proteins, which leads to irreversible loss of their structure and function as the protein unfolds. When the average protein size is observed by DLS versus temperature, one will observe a significant increase in the average protein size for temperatures larger than the aggregation point, which is generally assumed as the onset of protein melting point.

For the Z-potential evaluations, the appropriate ratios of protein and buffer solutions were placed in disposable folded capillary cells (DTS 1070, Malvern), and then inserted into the Zetasizer instrument. The Z-potential was measured by laser Doppler micro-electrophoresis. Briefly, an electric field was applied to the protein solution and consequently each molecule moved with a velocity related to its Z-potential. The velocity was measured using a laser interferometric technique (M3-Phase analysis light scattering), which allowed the calculation of the electrophoretic mobility, and from this, the average Z-potential and Z-potential distribution.

Light scattering measurements

Static and dynamic light scattering measurements were performed on protein solution samples with a concentration between 0.1 and 5.0 mg mL−1 in different crystallization cocktails. The Anti-CD20 concentration in these solutions was checked by measurement of the absorbance at 280 nm. All the solutions were filtered on a 0.22 µm filter (Anotop 10, Whatman) before measurement. The samples were placed in quartz ultra-low volume cuvettes (ZEN 2112, optical path 3 mm, Malvern) to measure with the Zetasizer Nano ZS instrument (Malvern) in static mode for the second virial coefficient B22 or dynamic mode for the protein diffusion coefficient Dm. The light intensity and its time autocorrelation function were measured at 173° scattering angle. All measurements were performed at 20 °C after 2 min of equilibration using automatic time settings.

The Debye plots were generated by using Debye’s light scattering equation:

$$\frac{KC}{{R}_{\vartheta }}=\frac{1}{{M}_{W}}+2{B}_{22}C$$
(1)

where Rθ is the excess Rayleigh ratio of the protein in a solution with a protein concentration C and MW is the average molecular weight of the protein. K is the optical constant and is defined as:

$$K=\frac{4{\pi }^{2}{n}^{2}{\left(\frac{dn}{dc}\right)}^{2}}{{N}_{A}{\lambda }^{4}}$$
(2)

where n is the solvent refractive index, dn/dc is the refractive index increment, λ is the wavelength of the incident light, and NA is Avogadro’s number. The second virial coefficient B22 [mL mol g−2] was obtained from the slope of the linear Debye plot of KC/Rθ versus protein concentration C.

The static light scattering results for B22 were accepted if the following criteria were satisfied:

  1. 1.

    Signal to noise ratio >130%, where noise is solvent kcounts;

  2. 2.

    Increasing trend for signal kcounts vs. protein concentration.

The interaction parameter kd [mL mg−1] and the self-diffusion coefficient Ds [m2 s−1] were determined by using linear fitting of the protein diffusion coefficient Dm, determined from the dynamic light scattering measurements, plotted against the protein concentration C [mg mL−1], using the following Eq. 331:

$${D}_{m}={D}_{s}(1+{k}_{d}C)$$
(3)

The determined interaction parameter kd can be interpreted as the difference between a thermodynamic term related to the product 2B22·MW and a hydrodynamic term (ξ1 + ν):

$${k}_{d}=2{B}_{22}{M}_{W}-({\xi }_{1}+\upsilon )$$
(4)

where ξ1 is obtained from the virial expansion of the concentration-dependent frictional coefficient and ν is the solvent viscosity.

Solutions that showed turbidity, indicating phase separation and crystallization/precipitation, were not used for measurements.

Protein partitioning measurements

Anti-CD20 repartition between aqueous and PEG phases has been estimated by the method described by Kress et al.32. A solution (5 mL) containing Na2SO4 1.1 M (13.54 wt.%) and PEG400 10.46% V/V (10.22 wt.%) in HEPES 0.1 M at pH 7.4, was added with 200 μL Anti-CD20 solution 10.3 mg mL−1 in the same buffer. After vigorous stirring for 5 minutes by vortex, the mixture has been left to rest overnight. The concentration of the protein in the two separated phases visually detected and in presence of excess of precipitated protein, was then assessed by UV spectrophotometry at 280 nm.

Results and discussion

Anti-CD20 crystallization and crystals analysis

Anti-CD20 needle-like crystals (Fig. 1a) appear by sitting-drop vapour diffusion crystallization experiments using a 1:1 volume ratio mixture of a 20 mg mL−1 protein solution in HEPES buffer 0.1 M at pH 7.4 and a precipitant solution containing PEG400 9.6% V/V (9.4 wt.%) and Na2SO4 0.86 M (10.6 wt.%) in the same buffer. Therefore, the initial composition of the drop in the vapour diffusion tests corresponds to 10 mg mL−1 Anti-CD20, 4.8% V/V PEG400, 0.43 M Na2SO4 in 0.1 M HEPES at a pH of 7.4 and is shown by the diamond symbol in Fig. 2. The crystals appear when the volume of the drop is nearly halved with respect to its initial volume. Accordingly, the final concentration of the components in the crystallizing drops is almost doubled that compared to the initial compositions and it is indicated by the pentagon symbol of Fig. 2.

Figure 1
figure 1

Optical microscope image of (a) crystalline Anti-CD20 monoclonal antibody obtained by sitting-drop vapour diffusion method and (b) Liquid-Liquid Phase Separation occurring just prior to crystallization. Solution composition near the crystallization point: 20 mg mL−1 Anti-CD20, 9.6% V/V PEG400, 0.86 M Na2SO4 in 0.1 M HEPES buffer at pH 7.4.

Figure 2
figure 2

Experimental state diagram for the system PEG400/Na2SO4/H2O at 293.15 K. Squares: experimental liquid-liquid equilibrium data from ref. 36; solid line: asymptotic fitting to tie-lines data from ref. 36; circles: batch crystallization conditions reported in Table 1; diamond: starting solution composition in vapour diffusion tests; pentagon: ending solution composition in vapour diffusion tests; triangles: measurements points for B22; dotted line: shifted binodal curve in the presence of Anti-CD20 protein. Solution composition is expressed as weight fraction (wt.).

Batch experiments in similar solution composition as that for crystallizing drops in the vapour diffusion tests (the term “similar” is used since the exact composition of the droplet at the crystallization point under equilibration with the reservoir solution it is not known), give also rise to crystals. Therefore, the batch working point coincides with the final conditions of the vapour diffusion tests. Since the batch composition does not vary with time, due to the absence of solvent evaporation, it is evident that the solution is supersaturated immediately upon mixing the protein and precipitant solutions, which results in a shorter crystallization time (12–18 h) than that observed with the sitting drop set up (36–48 h).

The UV-Vis and DLS measurements of washed and dissolved crystals confirmed their protein nature. All the analysed samples showed an intense band centred at 280 nm, typical of proteins absorption. The size distribution analysis by intensities show a single peak near 12 nm, size expected for Anti-CD20 protein molecules33, and a polidispersity <20%, indicating a high monodispersity.

Liquid-Liquid Phase Separation (LLPS) is observed nearly simultaneously with, or just prior to, crystals appearance (Fig. 1b) both in the vapour diffusion and batch crystallization trials. In accordance with other mAbs crystallization studies reported in literature20,34,35, also here it appears that the LLPS is a key step in Anti-CD20 crystallization.

Crystallization trials, aiming to investigate the effect of precipitant solution composition on crystal quality have been also performed by changing the starting amount of PEG400 and/or Na2SO4. As expected, the shape, size and density of crystals (number of crystals per unity of volume) are found to depend on buffer composition, initial protein concentration and crystallization method (sitting drop or batch). All these parameters have also shown to impact on the time required to observe crystallization. Table 1 shows the effect of the salt concentration on the crystal appearance for the batch experiments. Needles with a size of nearly 20 µm are the largest crystals obtained for starting solutions with 9.6% V/V PEG400 and 0.70–0.80 M Na2SO4. A further increase in Na2SO4 concentration produces a faster and massive protein nucleation in less than 24 hours and crystals of smaller size (<10 µm) with a spherulite-like morphology appear. For a Na2SO4 concentration of 0.9 M or larger, upon mixing the solution immediately turns opalescent due to a macroscopic LLPS in the solution, followed by a rapid protein precipitation as gel beads. On the opposite side, for the lowest amount of salt (Na2SO4 ≤ 0.65 M), the solutions remain clear for months.

Table 1 Mixed solution compositions, crystallization behaviour and crystal appearance in the batch crystallization experiments using an Anti-CD20 concentration 20 mg mL−1, 9.6% V/V PEG400 and 100 mM HEPES buffer at a pH of 7.4.

Figure 2 shows the experimental state diagram for the system PEG400/Na2SO4/H2O at 20 °C indicating the loci of the points related to solution compositions used for batch crystallization tests (circles) of Table 1, together with the starting (diamond) and ending (pentagon) solution compositions in vapour diffusion tests. The figure also displays the experimental liquid-liquid equilibrium data and the asymptotic fitting to tie-lines data reported in ref. 36.

SLS/DLS analysis

The protein stability in several formulation buffers but without precipitating agents (PEG and/or Na2SO4), was assessed by DLS, resulting in values for the melting point and Z-potential. As reported in Table 2, all samples are characterized by similar values of melting point, Z-potential, intensity weighted mean hydrodynamic size Zavg and polydispersion index (PDI). All protein samples show low positive values of the Z-potential, indicating a reduced colloidal stability in the various buffer solutions. Nevertheless, due to a PDI close to 0.1, all samples can be considered monodisperse. No significant variations were observed in the sample aggregation state over time in the range 12–24 h. Accordingly, further SLS/DLS studies were performed by using HEPES as formulation buffer since it was used in previously published crystallization protocols30.

Table 2 Results for melting point (Tm), Z-potential, intensity weighted mean hydrodynamic size (Zavg), and polydispersion index (PDI) from formulation buffers screening (Anti-CD20 concentration 20 mg mL−1; aHEPES buffer 0.1 M, pH 7.4; bSodium Citrate buffer 0.035 M, NaCl 0.15 M, pH 6.5; cTRIS buffer 0.1 M, NaCl 0.15 M, pH 8.0).

To study the aggregation propensity on Anti-CD20 in HEPES buffer and in various PEG400/Na2SO4 solution compositions, SLS and DLS measurements were performed on freshly prepared samples allowing to determine the second virial coefficient B22 and the interaction parameter kd, respectively. The value of KC/Rθ was determined as a function of protein concentration, ranging from 0.1 to 5.0 mg mL−1, by SLS. Results were used to evidence the combined effect of PEG400 and Na2SO4 on Anti-CD20 colloidal stability and to compare these results with effective crystallization outcomes.

It is well known that the addition of salts can cause an electrostatic double layer around the protein surface charges, which involves a shielding and the reduction of repulsive interactions among macromolecules. Furthermore, the salts ions compete with the protein for water molecules and dehydrate the protein (salting out effect37), thus inducing aggregation by mainly electrostatic and hydrophobic interactions. On the other side, polymers such as PEG bring the protein molecules together due to preferential interactions and osmotic potential38. Accordingly, the addition of PEG400 or Na2SO4 to Anti-CD20 solutions is expected to increase the aggregation aptitude of the protein and facilitate crystallization, as experimentally observed, e.g., in batch crystallization tests of Table 1. However, as displayed in Figs. 3 and 4, for increasing amounts of PEG or Na2SO4 when both components are present in solution, the B22 moves towards larger positive values, theoretically indicative of an increase in repulsive interactions. Furthermore, effective Anti-CD20 crystallization conditions seems to lay outside the crystallization slot defined by George and Wilson15.

Figure 3
figure 3

The second virial coefficient B22 as a function of PEG400 volume concentration (XPEG) at a constant amount of 0.48 M Na2SO4 in HEPES 0.1 M buffer at pH 7.4. The crystallization slot according to ref. 15 is shown in light grey. The solid line is a guide for the eyes.

Figure 4
figure 4

The second virial coefficient B22 as a function of Na2SO4 molar concentration (\({C}_{N{a}_{2}S{O}_{4}}\)) for constant PEG400 volume concentration xPEG = 4.8% V/V in HEPES 0.1 M, pH 7.4. The solid square symbol indicate B22 for a PEG400 concentration of 9.6% V/V without NaCl. The crystallization slot according to ref. 15 is shown in light grey. The solid line is a guide for the eyes.

Both Figs. 3 and 4 show that the combined effect of PEG400 and Na2SO4 has a noteworthy consequence on the B22 value. For instance, for the solution containing 0.48 M Na2SO4 in absence of PEG, a B22 value of −8.9 × 10−4 mL mol g−2 is obtained (Fig. 3) that is slightly below the crystallization slot. However, when adding PEG400 to the solution while keeping salt concentration constant, B22 steeply increases through the crystallization window towards positive values already at 3% V/V of PEG.

The B22 value for the solution containing 4.8% V/V PEG400 in absence of Na2SO4 is found to be slightly positive (B22 = 1.8 × 10−4 mL mol g−2), becoming negative (B22 = −2.26 × 10−4 mL mol g−2) when increasing PEG400 up to xPEG = 9.6% V/V without Na2SO4 (Fig. 4). This is in agreement with results from Ahamed et al.39, who observed the reduction in solubility for a mAb for increasing amounts of PEG400. Nevertheless, when salt is added to the solution (\({C}_{N{a}_{2}S{O}_{4}}\ne 0\)), the virial coefficient rises steeply with the salt concentration. These observations clearly indicate that a different effect on the measured value of B22 takes place when PEG400 and Na2SO4 are combined together, compared to the cases when they are used separately.

Figure 5 shows the effect of pH on B22 for solutions containing PEG400 and Na2SO4. At pH ≤ 7.6, the B22 remains roughly unchanged (2.28 < B22 < 4.35 × 10−4 mL mol g−2). These positive values are indicative of repulsive protein–protein interactions. At a pH of 8.0, the B22 becomes negative (−3.78 × 10−4 mL mol g−2) and then sharply increases approaching pH = 8.8, which is the theoretical isoelectric point of Anti-CD20 (pI = 8.8)40.

Figure 5
figure 5

pH dependence of B22 for Anti-CD20 solutions with 4.8% V/V PEG400, 0.43 M Na2SO4, and HEPES 0.1 M. The crystallization slot according to ref. 15 is shown in light grey. The solid line is a guide for the eyes.

It is expected that protein-protein interactions at a pH substantially below or above the pI are repulsive, while at a pH near the pI they become increasingly attractive due to the overall neutral charge of the protein. However, the actual pI of biomolecules might change with ionic environment, and generally decreases at different extents with increasing ionic strength, depending on the nature of the counter-ion41,42. It is speculated that in the presence of 0.43 M Na2SO4, PEG400 4.8% V/V and 0.1 M HEPES, the effective pI of Anti-CD20 is around 8.0. Therefore, for pH < 8.0, the slightly positive B22 values are most likely due to partially screening of the net positive electrostatic charges of protein molecules, while at pH = 8.0, where the net charge becomes null, the net resultant interaction is attractive. Above pH 8.0, the B22 values become strongly positive, indicating an ineffective screening of the net negative charge of molecules that results in repulsive interactions.

Since solution composition strongly affects protein stability and aggregate formation, the Na2SO4 concentration dependence of B22 was also investigated at higher PEG and salt concentrations, thus reaching compositions close to that used for crystallization experiments of Table 1. Correlation between B22 and Na2SO4 concentration is shown in Fig. 6, confirming an increasing trend similar to that observed for lower concentrations of Fig. 4 also in this case.

Figure 6
figure 6

B22 and kd as a function of Na2SO4 concentration for xPEG = 9.6% V/V and pH 7.4. The experimental crystallization window boundaries (see Table 1) are highlighted by the dotted vertical lines. The solid lines are guides for the eyes.

In Fig. 6 it can be seen that the B22 values are negative (attractive interactions) for Na2SO4 concentration ranging from 0.55 to 0.6 M, corresponding to conditions in Table 1 that give clear solutions, while they become positive (repulsive interactions) when the salt concentration is further increased up to 0.9 M. Within the crystallization window, i.e. Na2SO4 ranging from 0.7 to 0.86 M, the B22 values range from 3.2 × 10−4 to 5.7 × 10−4 mL mol g−2. This would be theoretically indicative of the absence of attractive colloidal interactions, in opposition to the results of the crystallization tests.

DLS measurements were then used to determine the interaction parameter kd. Generally, kd and B22 display a qualitatively similar trend43. However, looking at the behaviour of the interaction parameter kd in the crystallization range of Fig. 6, negative values are found for xPEG = 9.6% V/V and sodium sulfate 0.55–0.80 M, with slight variations within this range. Generally, a negative value of kd implies the existence of attractive interactions between protein molecules which slow down their diffusion in solution44. Since under the same conditions B22 is negative only for Na2SO4 < 0.6 M and then increases up to 1.23–4.94 × 10−4 mL mol g−2 in the salt concentration range 0.6–0.8 M, this indicates that underlying attractive interactions do not have a thermodynamic origin of colloidal nature. We can instead conclude that these interactions result from the hydrodynamic effects included in the second term of Eq. 445, most likely due to the increase of solution viscosity (from 1.949 cP in the PEG400 9.6% V/V solution without salt to 2.546 cP for PEG400 9.6% V/V and Na2SO4 0.9 M). Nevertheless, as for B22, the increasing trend of kd with the concentration of Na2SO4 does not comply with the outcome of crystallization tests. In fact, the increase of Na2SO4 concentration at values larger than 0.8 M gives rise to a theoretically significant increase of the strength in protein-protein repulsions, according to measured B22, values that are exacerbated near the LLPS region, while crystals are obtained in such conditions.

From the observed trends of B22 and kd, we can conclude that neither PEG400 nor Na2SO4 act as conventional precipitants for Anti-CD20 when used together since, on average, Anti-CD20 monomers apparently repel one another. Nevertheless, this evidently contradicts the experimental outcomes of crystallization tests. The B22 values are negative (attractive interactions) in formulations that remain clears for months, whereas positive values, corresponding theoretically to net repulsive interactions, have been found for the best-performing crystallizing or either in precipitating conditions.

Considering the possible variations in their structure, morphology, size, surface charge and chemistry, proteins definitely represent one of the most complex colloidal systems. Several mechanisms could lead to molecular aggregation in solution and it is not excluded that more than one is active in certain conditions. Therefore, it is not surprising that B22 does not always reflect the aggregation propensities of proteins in a given formulation. Several papers can be found in literature reporting discrepancies between B22 and the susceptibility to aggregation of structurally complex macromolecules, like monoclonal antibodies43,46. Crystallizing conditions with positive B22 values were found for a mAb molecule with PEG400 by Rakel et al.35. This behaviour has been ascribed to the ambivalent character of PEG400 as precipitant and stabilizer35,47. Kress et al.48 observed the decrease in B22 of an immunoglobulin G (IgG) when increasing PEG2000, sodium citrate and NaCl solute concentration, if used separately. However, the mutual effect of PEG and salts resulted in a shift of B22 from negative to almost neutral values when increasing the NaCl amount if used together32. This behaviour has been explained by a solubilizing effect of IgG in PEG2000 induced by NaCl until a maximum salt concentration. Exceeding this NaCl value, a solid-liquid equilibrium, due to protein precipitation, was observed. Herhut et al.49 measured B22 for an IgG and D-xylose ketol-isomerase (172.3 kDa) with ammonium sulphate and PEG12000 and PEG2000, respectively. They observed the initial decrease in B22 at low PEG concentration, passing through a minimum and then increasing for higher PEG concentrations. This behaviour has been explained by considering that polymer-induced interactions are not solely attractive over the whole polymer concentration range. Normally, polymer chains behave as coil-inducing attractive interactions between protein molecules at low-polymer concentration. However, at high polymer concentration, flexible coils elongate and enter space between the proteins, so that repulsive interaction are induced50,51. This explanation has been also confirmed by molecular dynamic simulations at different polymer concentration by Cao et al.52.

It has also been reported that conformational stability plays an important role in aggregation propensity, with partially unfolded conformational intermediates being responsible of aggregate formation9. In this case, the B22 value is unlikely to correlate with long-term aggregation because the structurally perturbed state susceptible to aggregation could be present in a small fraction of molecules compared to the native species46. Aiming to shed light on this aspect, protein structure in solution has been investigated in the present work.

Protein structure in the solution

On the bases of the results above, measurements of melting temperature (Tm) for several solution compositions, including xPEG = 9.6% V/V and Na2SO4 at various concentrations, and Circular Dichroism (CD) in selected formulations, were performed aiming to investigate conformational stability of Anti-CD20. Results show slight variations of the melting points: Tm = 67.0, 67.5, 68.0, and 67.5 ± 0.5 °C for 0.55, 0.65, 0.75, and 0.85 M Na2SO4, respectively, confirming that samples, on average, share similar water activity according to the Wyman-Tanford equation53.

Circular Dichroism spectra of protein solutions give information on the secondary structure, folding properties and conformational variation of the proteins54. The general shape of the spectrum of the provided Anti-CD20 solution (Fig. 7, initial conditions) corresponds to the expected spectra for Anti-CD20, as it shows the typical shape of a monoclonal antibody with a minimum around 220 nm and a maximum around 200 nm, suggesting a secondary structure dominated by β-sheet motif54. Moreover, these spectra correspond to the CD-spectra of Anti-CD20 from literature55.

Figure 7
figure 7

Far-UV CD spectra of Anti-CD20 under initial conditions (blue spectra, mAbs concentration 9.95 mg mL−1 in stock buffer), after buffer exchange with HEPES 0.1 M pH 7.4 (red spectra, mAb concentration 100 mg/mL in crystallization buffer) and after salt mixing (green spectra, mAb concentration 60 mg/mL in crystallization buffer and with 0.9 M Na2SO4 and 9.6% V/V PEG400).

These spectra show that the protein is well folded in every buffer, i.e. the 3D-structure of Anti-CD20 is preserved under all conditions tested. However, the maximum ellipticity (around 200 nm) is much higher after the buffer exchange (red and green spectra in Fig. 7) than in the initial buffer (blue spectrum), while ellipticity at other wavelengths is similar in every buffer. This suggest that the β-sheet motifs of the mAbs are favoured and stabilized in the new HEPES buffer, in comparison to the initial buffer (0.025 M sodium citrate at pH 6.5 and 0.154 M sodium chloride).

There is no significant difference in the far-UV CD spectra of Anti-CD20 before (Fig. 7, red spectrum) and after the addition of Na2SO4 and PEG400 as precipitating agents (Fig. 7, green spectrum), although a small shift can be observed, showing slight changes in the 3D-structure. The main difference between these spectra is around 230 nm, but the peak is quite narrow and this wavelength does not correspond to any typical feature of protein structure. Therefore, the difference could be due to Na2SO4 or PEG400 absorbance or might be due to some noise. It should be noted, though, that the crystallization solution had to be diluted in order to enable CD measurement. Since the spectra before and after the concentrated crystallization conditions are the same, it can be concluded that the concentrated crystallization conditions do not alter the secondary structure of the protein.

The Anti-CD20 secondary structure is preserved after centrifugation and addition of the precipitating agents, as the typical β-sheet motif appears clearly in the spectrum (Fig. 7, green spectrum). Moreover, this secondary structure seems to be favoured in the crystallization buffer, meaning that the 0.1 M HEPES at pH 7.4 buffer promotes Anti-CD20 crystallization by making the secondary structure less flexible. However, no clear effect of Na2SO4 or PEG400 on the Anti-CD20 secondary structure is visible.

The role of LLPS in Anti-CD20 crystallization

A complementary scenario can be envisioned when considering LLPS phenomena observed concurrently or just prior of Anti-CD20 crystallization.

For simple protein/salts or protein/PEG systems, it has been often reported that the B22 value can be used as a predictor of LLPS formation, since weak protein-protein interactions, in general hydrophobic in nature or even limited to specific protein chain sequences, are at the origin of the phase separation22. However, our results seem to suggest that the LLPS (observed just prior or at the point of protein crystallization) proceeds without attractive interactions, since B22 is negative when a mono-phase system is observed while it turns positive as approaching the liquid phase separation and crystals formation.

It could be argued that, in our complex mixture that includes several components, the B22 values fail to predict the experimentally observed crystallization since it may be rather related to the polymer-salt phase separation that can occur without protein. The liquid-liquid separation of PEG/salt aqueous solutions is a fundamentally different phenomenon from PEG/protein or salt/protein LLPS, which originate by a combination of hydrophobic and salting-out effects, whose origin has been attributed to the different conformations of the PEG polymer chain in aqueous solution and their respective hydration24. Actually, the PEG400/Na2SO4/H2O system is known to generate aqueous two-phase systems36,56,57. Figure 2 displays the experimental binodal curve together with liquid-liquid equilibrium tie-lines data for PEG400 + Na2SO4 + H2O system at 293.15 K (the temperature used in this work for crystallization experiments and SLS/DLS measurements)36.

Concerning the specific system studied in this work, for a protein concentration of 20 mg mL−1 and a PEG400 content of 9.6% V/V (9.4 wt.%), phase separation is observed for salt concentrations ≥0.7 M (≥8.6 wt.%) (crystallization points of Table 1 and dark circles in Fig. 2), that is quite far from both binodal curve. However, the presence of the protein can reduce the amount of salt needed for observing LLPS in the multicomponent PEG/salt/protein/water systems24. Consequently, it cannot be excluded that the effective binodal curve for PEG/salt LLPS is shifted down to conditions corresponding to the observed crystallization points, in a region of poor solvent conditions, thus giving a decisive contribute to the aggregation/crystallization behaviour of Anti-CD20.

Figure 8 shows the change in turbidity (Residual Intensity) for a solution containing PEG400 9.6% V/V in 0.1 M HEPES at pH 7.4 and increasing amount of Na2SO4. It is observed that, in absence of the Anti-CD20, a sharp increase in solution turbidity is observed for salt concentration >1.1 M (>13.5 wt.%), that is very close to the binodal curve reported in Fig. 2 for the PEG400 + Na2SO4 + H2O system. Since in the presence of the protein turbidity due to liquid-liquid phase separation is observed for salt concentration larger than 0.65 M for the same amount of PEG, this proves the shift of LLPS region towards lower values of PEG and/or salt components when Anti-CD20 is present in the solution.

Figure 8
figure 8

Residual Intensity measured by DLS at a fixed scattering angle of 173° for a solution containing PEG400 9.6% V/V in 0.1 M HEPES at pH 7.4 and increasing amount of Na2SO4, without protein. The solid line is a guide for the eyes.

It is worth noting that B22 measurements reported in Figs. 3 and 4 have been made in a region of Fig. 2 where LLPS (and crystallization) are not observed (triangles in the left-bottom side of the figure). This means that PEG/salt LLPS is not responsible for the contradiction between B22 measurements and the outcome of crystallization trials, but rather the intrinsic complexity of the system makes ineffective the measurement of the osmotic coefficient as predictor parameter for crystallization conditions.

According to our results, it can be proposed that PEG/salt LLPS causes the preferential partitioning of proteins in a specific salt- or PEG-rich phase. In general, the affinity of the protein for the polymeric component, due to hydrophobic interactions, favours the protein concentration in the PEG-rich phase39,58. This is particularly true for PEG400, which is a low molecular weight polymer and, therefore, its size exclusion effects on proteins are negligible while the dominant factor in the separation is the salting out effect, produced by the sulphate, which causes the protein to migrate to the PEG-rich phase. This effect has been confirmed by measurements on Anti-CD20 partitioning, showing that more than 90% of the solution protein migrates in the PEG-rich phase in the protein-saturated Anti-CD20/PEG400/Na2SO4/HEPES/H2O mixture displaying phase separation.

In such conditions, during LLPS, the Anti-CD20 concentrates in droplets that have a density higher than the average density of the whole system thus generating a much higher supersaturation. When the protein concentration exceeds a certain limit, precipitation at the interface is observed and the precipitated protein is in equilibrium with the protein solubilised in each phase59. For particularly high Anti-CD20 (>30 mg mL−1 or PEG/salt concentrations (i.e. PEG400 9.4 wt.% and Na2SO4 ≥ 11.1 wt.%), these droplets look not to be as liquid, having instead the appearance of gel beads. Protein crystals initiate to nucleate and grow near the boundary of the most protein-concentrated droplets (Fig. 9), suggesting that the droplet interface participates in the nucleation mechanism60. Subsequently, crystals appear within the bulk of the droplets, which spread and coalesce. Many droplets disappear as crystal growth proceeds, suggesting that a continuous exchange exists between the droplets and the surrounding medium.

Figure 9
figure 9

Protein crystals nucleating near the boundary of the droplets (a) and crystals forming within the bulk of the droplets (b,c).

Experimental observations clearly indicate the important role of LLPS and of the established liquid-liquid interface in inducing Anti-CD20 nucleation, even in absence of underlying net attractive interaction between proteins. It is likely that only protein molecules distributed at the droplets interface experience local short-range attractive interactions, due to the specific environment and/or conformational changes, not influencing the mean B22 value.

On the basis of our results, it could be hypothesized that the occurrence of LLPS in crystallizing/precipitating conditions results in a partially unfolded protein conformation or in a suitable variation of the secondary/tertiary protein structure, as already reported in other cases27, which can be the species prone to aggregation through a mechanism driven by conformational change rather than colloidal interactions, and which initiate the crystallization process. The conformationally perturbed protein can be a minor species, which does not alter significantly the predominant native protein population and only slightly contributes to the net protein–protein interaction value measured by B22. Indeed, SLS is an integrative technique, which provides only averaged information. If only specific short-range orientations or aggregation-prone structurally perturbed states can lead to attractive interactions, the remaining repulsive orientations or species dominate in the light scattering results.

It is important to note that the measurements of intermolecular interactions were performed under low protein concentration conditions (≤5 mg mL−1), while crystallization trials are generally conducted at higher concentrations (≥20 mg mL−1). Both B22 and kd determined in dilute conditions are considered generally predictive of high-concentration behaviour. Nonetheless, high concentration behaviour can differ from that observed at dilute conditions, since “crowding” effects, i.e. the increased chance of molecular interactions, deviation from thermodynamic ideality and higher-order interactions (e.g., B23, B222) can significantly alter the net inter-protein interactions55,61. The contribution of these effects to aggregation propensity of structurally complex molecules such as mAbs, cannot be ruled out.

Conclusions

This work reports about the study of Anti-CD20 (a full-length monoclonal antibody) crystallization in a PEG400/Na2SO4/H2O system occurring near LLPS conditions. The DLS/SLS studies performed to verify the correlation between the second virial coefficient and molecular diffusivity with the protein aggregation propensity in selected solution formulations indicates that B22 remains strongly positive (a sign of repulsive protein–protein interactions) for all conditions that have been instead ascertained to be effective for protein crystallization or precipitation. For all the crystallizing or precipitating conditions, the occurrence of LLPS phenomena was observed concomitantly or near the nucleation stage. Even in this case, the second virial coefficient does not correlate with the LLPS, which mainly proceeds without attractive protein interactions, and that may be rather related to the polymer-salt phase separation that can occur without protein. The possible interplay between the salt/PEG and protein/PEG phase separation, which can affect the solvent quality, can have a role on the polymer mediated forces and finally on Anti-CD20 aggregation. On the basis of these measurements and the experimental observation of LLPS near the nucleation zone, it is proposed that a minor population of protein molecules can experience valuable protein-protein attractive interactions, induced by local environmental factor, slightly affecting the averaged B22 parameter.