Introduction

Reverse engineering biological systems requires mathematical tools to infer interactions and mechanisms of a given process to enable forward engineering applications1,2. For instance, a detailed understanding of the pathways involved during a wound-healing process can help design new treatment plans or drugs3,4. The calibration of computational models of biological systems is challenging due to the large number of interactions whose mathematical description encompasses a very large parameter space. Inevitably, there exists a tradeoff between computational speed and level of detail. For instance, subcellular element models of epithelial morphogenesis can involve hundreds of parameters with an average computational time on the order of days5. These models recapitulate a wide variety of biological processes across multiple model organisms with increasing levels of detail. However, before any model can make new predictions, it must be calibrated and validated against experimental data.

On the other hand, in vivo biological experiments are also costly and time-consuming, and the number of features or variables that can be measured in the lab is often limited. A wide range of forces, often with nonlinear formulations, jointly contribute to the shape of an organ. Identifying the region of parameter space that produces similar tissue shapes is the first step in understanding morphogenetic robustness. Moreover, the ability to parametrize a shape allows for a more robust comparison between the control (wild-type) condition and a mutant organ shape, yielding physical insights (Fig. 1). For example, a similar methodology applied to signaling data showed that projecting raw Ca2+ signatures from single cells into the more meaningful parameter space of a physics-based model using Approximate Bayesian Computation led to the discovery of four distinct cellular states6.

Fig. 1: Initial model formulation recapitulates the morphological features of the wing disc.

a Apical view of a z-projection of a Drosophila 3rd instar wing imaginal disc. b Cross section of the tissue along the anterior-posterior (AP) axis offset from the dorsal-ventral (D, V) boundary. c The initial geometry used for Surface Evolver simulations. d Definition of subcellular cytoskeletal interactions used to define the system’s total energy. e Minimum energy configuration obtained after optimization for parameters in Table 3. Note: The loss function \({\mathscr{L}}\) measures the error between the experimental shape \({S}_{i}\) and the predicted shape \(\bar{S}(\theta ,x)\) from the model.

A typical modeling workflow includes the following steps7. The overall complex process is conceptually decomposed into a subset of biological processes. These individual subprocesses are first calibrated using a subset of the measurable experimental data before moving towards calibration of the entire process8. However, the many interactions between subprocesses often prevent identifying the global optimum. Also, choosing the best error function for comparing experimental data with model output is challenging. Methods of calibrating computational biology models include nonlinear least squares regression9, maximum likelihood estimation (MLE), maximum a posteriori (MAP) estimation10, Markov chain Monte Carlo (MCMC)11, and genetic algorithms12, among others13,14,15,16.

Each of these algorithms has advantages and disadvantages. While least squares and MLE parameter estimation methods typically exhibit fast local convergence, they often get stuck at a local minimum when solved with gradient-based optimization methods. MCMC avoids these (sometimes poor) locally optimal point estimates by inferring a posterior distribution, which often requires at least an order of magnitude more computational effort. For instance, MCMC has been used to estimate parameters of ordinary differential equation-based models in systems biology11. Another approach is Sequential Monte Carlo Approximate Bayesian Computation (SMC-ABC), where rejection sampling is used to estimate the posterior based on a prior; it has been used previously to estimate parameters of models of tissue growth17 and Ca2+ signaling6. However, such an approach is computationally expensive because it requires estimating a full distribution, and it may not be feasible for calibrating computationally expensive modeling frameworks such as molecular dynamics or subcellular element (SCE) models. Moreover, these detailed modeling approaches require calibration based on multiple variables measured in the lab. Consequently, new or hybrid approaches that leverage the strengths of multiple methods are needed to calibrate complex mechanistic (biological) mathematical models efficiently and robustly.

In this work, we present a computational pipeline employing Bayesian Optimization18,19,20 (BO) to infer the primary biophysical mechanisms driving the shape of an organ. We utilize the Drosophila wing imaginal disc cross-section shape (\({S}_{i}\)) for inferring the parameters (\(\varTheta\)) of a biophysical model (\(\bar{S}\)) of the wing disc cross-section (Fig. 1). In general, the framework allows projection of the shape of an organ to a more meaningful parameter space describing biophysical mechanisms driving organ shape generation and maintenance. In this work, we used Surface Evolver21 for simulating the wing disc cross-section. Compared to more detailed biological models of wing disc morphogenesis, such as our collaborative SCE model22, the simplified model generated by Surface Evolver provides a testing platform to assess the utility of the approach toward model calibration of multicellular systems.

To increase the computational efficiency, the framework couples the mechanistic model to Gaussian process regression (GPR) surrogate models to map the model parameters to the quantitative objective functions while considering uncertainty in model prediction (Fig. 2). The surrogate model is then used to sample new points based on an acquisition function that guides the sampling of an optimal solution. While the uncertainty with GPR models can be significant, they are routinely used for real-time control and automation23. A critical theme of the predictive control literature in the past few decades is that a mediocre model is still valid due to the power of feedback. GPRs are especially attractive because they are non-parametric models that “learn” as new data are incorporated. This also prevents the model from getting trapped in a local minimum. BO is well suited to problems with a large parameter space and easily handles constraints. As such, BO has been applied in a wide range of research areas24 including parameter estimation for computationally expensive scientific models25. Moreover, several prior studies have demonstrated GPR surrogate models are computationally efficient emulators of biological processes16, including post-transcriptional regulation in Drosophila26, dynamics of microbial systems27, cancer tumor growth28, and biopharmaceutical manufacturing29.

Fig. 2: Selection of the optimal quality of fit metric and input parameters for surrogate model training dataset.

a Evaluating similarity measures for computing the objective function. b, c Curvature (κ) along the basal surface of the columnar cells is plotted as green lines. The basal surface is divided into three equal regions (Anterior-A, Central-C, Posterior-P). The average basal curvature is plotted in red for the subregions. d A scatter plot visualizes 50 points sampled by varying model parameters \({k}_{B}^{{col}}\) and \({K}_{{ECM}}\) using LHS. Points are color-coded based on the averaged central curvature (\({\kappa }_{C}\)) of the model output. e Heatmap showing a sensitivity measure represented by the color bar legend. The vertical axis represents the input parameter values of the physics-based model, while the horizontal axis represents different morphological features extracted from an output shape. f A representative AP cross-section of early 3rd instar wing imaginal disc (~84–90 h AEL). g Different shapes generated using the parameter sampling are arranged as per increasing Fréchet errors with respect to the representative experimental cross-section.

We first benchmark the pipeline using a synthetic tissue shape with known model parameters (Fig. 3). After benchmarking, we employ the pipeline to predict the morphology of the wing disc undergoing degradation of the extracellular matrix (ECM). To do so, we utilized confocal microscopy data from our earlier investigation30 (Fig. 4). Lastly, the computational framework, along with fixed tissue imaging data, demonstrated that Piezo, a mechanosensitive ion channel, impacts fold formation within the Drosophila wing imaginal disc through the regulation of actomyosin contractility and cell volume (Fig. 5). This opens a path to discovering new mechanisms of mechanotransduction in cytoskeletal regulation during organ growth and morphogenesis. The contributions of this work in the domain of systems biology include (1) a successful application of BO to biophysical models of tissue morphogenesis, (2) a demonstration of Fréchet distance as a useful error metric for calibration of model parameters to define organ shape, and (3) a study highlighting the role of Piezo-mediated mechanosensation in fold formation within the Drosophila wing imaginal disc. Overall, this work provides an efficient pipeline to infer biophysical mechanisms of morphogenesis using morphological data of organs and tissues and identifies a key role of Piezo in regulating tissue shape during wing disc development.

Fig. 3: Bayesian optimization using a Gaussian Process regression emulator model enables efficient parameter estimation based on the tissue cross-section shape.

a A schematic for the Bayesian optimization framework. b A scatter plot of the error predicted by the GPR model against the true error values. The shaded green region represents the lower and upper bound in prediction while the red solid line is a line of parity. c Plot showing the best error predicted so far with the number of iterations for three different exploration parameter values. d Kernel density plots of the error sampled during the BO for three different exploration parameter values. e A parallel coordinate plot showing the best parameters identified by the BO framework. The plot displays each data point as a line spanning multiple parallel vertical lines. Each vertical axis in the plot represents the estimated Surface Evolver parameter in log scale. The position of the line on each y-axis corresponds to the value of the parameter represented by the particular y-axis. Different colors represent different \(\zeta\) values and have been included as a legend. The last vertical axis represents the \({F}_{B}\) with respect to the target shape. f Eigendecomposition analysis of the local Hessian of \({F}_{B}\). Each row of the heatmap corresponds to the eigenvector of the Hessian matrix. The rows are arranged in decreasing order of the corresponding eigenvalues represented by a bar plot on the left-hand side of the heatmap. g Tissue geometry for the synthetic target shape and the best shape obtained by the BO framework. The green contour on top of the best shape predicted represents the basal contour of the synthetic target shape.

Fig. 4: Bayesian Optimization successfully recapitulates the experimental perturbation using only the experimental output shape.

a, b DV cross-section of wing discs before and after treatment with Collagenase. c, d Best shapes predicted by the BO framework in response to the experimental data. e A parallel coordinate plot showing the best parameters identified by the BO framework. Different colors represent control and mutant samples and have been included as a legend. Each vertical axis represents one of the parameters represented in a log scale. The last vertical axis represents the \({F}_{B}\) with respect to the target shape. f Box plot showing the variation of parameters for the best shapes predicted by the BO framework. g Box plot showing the variation of parameters for the two clusters of parameter sets recapitulating the control “wildtype” shape. The boxes indicate the interquartile range (25th to 75th percentiles), the line inside the box marks the median, and lines extending from the box represent the range from the 10th to the 90th percentiles. h, i Expression of pMyoII in the AP cross sections of discs expressing en>mysRNAi (n = 5).

Fig. 5: Piezo regulates cell height and fold formation in the Drosophila wing imaginal disc.

An ap-Gal4 driver was used to downregulate and overexpress Piezo in the dorsal compartment of the wing disc using UAS-PiezoRNAi and UAS-Piezo transgenic fly lines, respectively. a AP cross section of the Oregon-R disc has been used as a global control for comparison. Further, the ventral compartments are also interpreted as an “internal” control, given that the perturbation does not explicitly change Piezo expression levels in that compartment. However, some morphological changes are evident compared to the wild-type strain. It should be noted that the variation of parameters is lower for Oregon-R. Additionally, since tissue mechanics is interconnected, the comparison of the ventral compartment with Oregon-R should not be made directly, as perturbations in one compartment can influence the biomechanics and morphology of the other. b The best shape prediction corresponding to the control data. c, e AP cross sections along the ventral (V, control) and dorsal (D, mutant) compartments of the disc expressing ap>PiezoRNAi. Fluorescent labels are indicated within the plot (n = 8). d, f Best shapes predicted by the BO framework in response to the experimental data in A-i and A-ii. g–j A similar analysis as A for ap>Piezo. k–n Box plot showing the variation of parameters (x-axis) for the best shapes predicted. Individual data points have been scattered over the boxes. o–r Expression of pMyoII in the DV cross sections of mutant discs. The yellow dashed line represents the approximate dorsal-ventral compartment boundary. Yellow arrows indicate changes in pMyoII and fold formation within the dorsal pouch compartment. At the same time, the magenta arrow suggests changes in the hinge and notum portion of the wing imaginal disc. (e: n = 7, f: n = 8) s–v Piezo regulates ECM elasticity by controlling MMP1 and Collagen IV expression levels. This is demonstrated by the expression levels of (g) Collagen IV (n = 7) and (h) Matrix metalloproteinases 1 (MMP1) (n = 8) in the DV cross sections of Piezo knockdown discs. All scale bars = 50 μm.

Results

Formulation of a flexible mechanistic model that recapitulates the morphology of the wing disc cross-section

We first utilized Surface Evolver21 to formulate a model of the anterior-posterior (AP) cross-section of the Drosophila wing imaginal disc (Fig. 1a, b). This approach enabled us to systematically benchmark the Bayesian optimization framework for nonlinear model calibration. The Drosophila wing imaginal disc is an established model system for calibrating models of epithelial morphogenesis31,32. At later stages of development, the wing disc consists of a fluid-filled sac with a lumen surrounded by epithelial cells of different subtypes (squamous, cuboidal, and columnar) (Fig. 1c). A thin extracellular matrix (ECM) encloses the basal surface of cells33. The central oval-shaped region, termed the pouch, resembles a dome-like structure. The shape of the pouch along each direction is patterned by highly conserved morphogens30,34,35,36. The geometrical attributes of the wing imaginal disc on a cellular basis were estimated based on a literature review33,37 (Table 1), and cell lengths were normalized to avoid numerical instabilities within the model.

Table 1 Geometrical attributes of a wing imaginal disc

A set of energy functions was defined to incorporate the contributions of known cytoskeletal regulators in the wing imaginal disc (Fig. 1d, Table 2). For instance, phosphorylated non-muscle Myosin II (pMyoII) generates contractile forces by pulling on actin cytoskeleton filaments38. This contractility drives shape changes within the tissue that include fold formation39,40. In our model, we assumed each cell edge was a Hookean spring with a natural length, \({l}_{0}\), and a spring constant, \(k\). Conceptualizing cell edges as springs enables the modeling of length changes. Higher cell contractility increases stiffness and causes resistance to size variations, whereas lower contractility enables more deformations. The energy is calculated using Hooke’s law. Energy is also defined for each individual cell and lumen to penalize changes in volume. The target volume of each component is defined, and Surface Evolver calculates the product of a user-defined constant pressure and any deviation from the target volume to estimate the volume energy due to relative compressibility. Apical and lateral adhesion of cells is primarily mediated by apical localization of E-Cadherin and is modeled through the definition of lateral tension41. Further, the basal surface of the tissue adheres to the extracellular matrix (ECM) via Integrin adhesion molecules42,43. This adhesion is modeled as an additional tension residing in the basal cell edges. Lastly, the ECM is modeled as an elastic string where the energy is evaluated as the integral of the squared curvature over the length of the string. The formulation is similar to one adopted by Storgel et al.44. Additionally, the basal contractility (\({k}_{B}\)) is defined as the sum of actomyosin-mediated contractility at the basal surface (\({k}_{{con}}\)) and the ECM (\({k}_{{ECM}}\)). All the energy functions defined are available as sub-routines to be called within the Surface Evolver environment. We also define a customized repulsion module to stop the apical edges from crossing each other. After every specified number of iterations (50), the repulsion subroutine is called; it tracks the distance between each node on the apical surface and the centers of all the other apical edges. If the measured distance is less than a set threshold, the vertices are shifted away by a minimal distance in the opposite direction, normal to the line joining the vertex and the center of the edge.
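The edge and ECM energy terms described above can be illustrated with a minimal Python sketch. This is not the Surface Evolver implementation: the function names, the 1/2 prefactor in the spring term, and the turning-angle approximation of curvature on a polyline are assumptions made for illustration, and the forms actually used in the model are those listed in Table 2.

```python
import math


def spring_energy(length, natural_length, k):
    """Hookean energy of one cell edge, 0.5 * k * (l - l0)^2 (assumed prefactor)."""
    return 0.5 * k * (length - natural_length) ** 2


def bending_energy(points, k_ecm):
    """Discrete estimate of k_ecm * integral(kappa^2 ds) along an open polyline.

    At each interior vertex, curvature is approximated as the turning angle
    divided by the mean length of the two adjacent segments (an assumption).
    """
    energy = 0.0
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        ax, ay = x1 - x0, y1 - y0
        bx, by = x2 - x1, y2 - y1
        # Signed turning angle between consecutive segments.
        angle = math.atan2(ax * by - ay * bx, ax * bx + ay * by)
        ds = 0.5 * (math.hypot(ax, ay) + math.hypot(bx, by))
        kappa = angle / ds
        energy += k_ecm * kappa ** 2 * ds
    return energy
```

A straight string carries no bending energy in this sketch, and the energy grows with the square of the local turning angle, which is why a stiffer ECM resists basal curvature.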

Table 2 Formulation of energy terms included in the Surface Evolver simulations

Surface Evolver minimizes the system’s total energy under shape constraints to obtain a minimal energy configuration21. As a qualitative validation, we chose model parameters based on known experimental constraints (Table 3). Over the course of the minimization, the system’s total energy decreases and then converges to a stable minimum value (Supplementary Fig. 1). This minimum energy configuration qualitatively resembles the experimental cross-section of the wing imaginal disc (Fig. 1e). The total computational time for convergence of a typical simulation is around 40 minutes using a desktop workstation running Ubuntu 20.04.2 LTS with an Intel® Xeon® CPU E5-1603 v3 @ 2.80 GHz ×4 processor and 16 GB of RAM. This is ideal for benchmarking a nonlinear model calibration framework, as the model provides sufficient detail to capture salient features of the cross-sectional shape but is not so computationally expensive as to preclude analysis of the overall pipeline. In the next sections, we describe the sensitivity of model parameters and a framework to quantify the error between the model-generated shape and the experimental cross-section.

Table 3 Model parameters: the italicized parameters are to be defined within the function that writes the initial geometry file

Defining the input-output data for the surrogate model

Fréchet error is the best metric for comparisons between tissue shapes

Here, we first describe a methodology for comparing the outer (basal) contours of any two wing disc cross sections. The input to a model of the wing disc cross section is the set of parameters (\(\theta\)) representing cytoskeletal regulation (Fig. 1d). For model calibration, an objective function comparing the target experimental shape (\({S}_{i}\)) and the Surface Evolver generated cross-section (\(\bar{S}(\theta ,{x})\), where \(x\) represents the initial geometry) needs to be defined (Fig. 1). Elliptic Fourier Descriptors (EFD)45 are used to normalize the basal surface of epithelia against size and translation, given that the experimental data and the simulated cross sections are of arbitrary length scales.

To select the best similarity measure for comparing two contours in general, we employed the following strategy: A random simulation was selected from the parameter screening data, and an error was computed based on several similarity measures with respect to the other cross-sections generated within the screen. In particular, we tested the following metrics for comparing two cross-sectional shapes (Fig. 2b):

  • Area between curves, which is used to compute the area enclosed between any two curves.

  • Curve length measure, which is a metric that is used to compare two curves based on their total arc length.

  • Elliptic Fourier Descriptors (EFD), which are the calculated Fourier coefficients of the chain-encoded closed contour. This shape descriptor is both rotation and translation invariant. An L2-norm between the EFD coefficients of two shapes is used to quantify similarity45.

  • Partial Curve Mapping, which is a method of comparing curves through alignments of the smaller curve regions.

  • Dynamic Time Warping (DTW), which compares similarities between two signals. It works by distorting one of the signals to maximize its alignment with the other signal with which it is compared46.

  • Fréchet distance, which is defined as the shortest leash length needed to connect two points that traverse the two curves from start to end, where each point is allowed to move along its curve at a different speed47.

The best shapes identified using each of the similarity measures listed above for the randomly selected target shape are plotted in Fig. 2a. Qualitatively, Fréchet distance47 (\({F}_{B}\)) performed the best in identifying the basal contour most closely approximating the target shape. One advantage of Fréchet distance is that the measure is itself an error quantification. This simplifies data-driven modeling as it also serves as the objective function. We also report average apical (\({L}_{A}\)), basal (\({L}_{B}\)), and lateral (\({L}_{L}\)) lengths of each cell subtype along with the average basal curvature for the anterior (\({\kappa }_{{Anterior}}\)), medial (\({\kappa }_{{Medial}}\)), and posterior (\({\kappa }_{{Posterior}}\)) regions of the wing disc (Fig. 2b).
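To make the error metric concrete, the following Python sketch computes the discrete Fréchet distance between two sampled contours using the standard dynamic-programming recurrence. It is an illustration rather than the pipeline's own code: the function name is hypothetical, and the contours are assumed to be ordered lists of (x, y) points that have already been normalized for size and translation.

```python
import math


def discrete_frechet(p, q):
    """Discrete Fréchet distance between polylines p and q (lists of (x, y))."""
    n, m = len(p), len(q)
    # ca[i][j] holds the coupling distance for the prefixes p[:i+1], q[:j+1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]
```

Because the returned value is already an error, its negative can be used directly as a response to maximize.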

Parameter sensitivity analysis identifies cell contractility as a key regulator of tissue shape

A sensitivity analysis was carried out to study the effect of varying \(\theta\) on overall tissue shape (\(\bar{S}(\theta ,{x})\)) (Fig. 2c, Supplementary Fig. 2). Each parameter \(\theta\) was increased and decreased by 70% of its original value. Morphological features within \(\bar{S}(\theta ,{x})\) were measured, and a central finite difference scheme was used to compute the sensitivity of model parameters. As expected, changing the natural lengths of apical, basal, or lateral edges of the squamous, cuboidal, and columnar cells (\({L}_{0,(A,{B},{L})}^{{squ},{cub},{col}}\)) caused the most changes across any of the measured features. A loss in the natural length of the cell often represents the cell’s failure to regulate actin polymerization and depolymerization. Dysregulation of actin polymerization causes severe morphological defects48.
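The central finite difference scheme can be sketched in Python as follows. The function name and the toy feature used in the usage note are hypothetical; in the actual workflow each feature evaluation would correspond to a full Surface Evolver run followed by feature extraction, and any normalization of the sensitivities used for the heatmap is not reproduced here.

```python
def central_sensitivity(feature, theta, index, rel_step=0.7):
    """Central finite-difference sensitivity of feature(theta) to theta[index].

    The parameter is set to (1 + rel_step) and (1 - rel_step) times its
    baseline value, mirroring the +/-70% perturbation described in the text.
    Assumes a nonzero baseline value for theta[index].
    """
    h = rel_step * theta[index]
    up = list(theta)
    down = list(theta)
    up[index] += h
    down[index] -= h
    return (feature(up) - feature(down)) / (2.0 * h)
```

For example, with a toy feature `lambda t: 3 * t[0] + t[1] ** 2` and baseline `[2.0, 1.0]`, the sensitivities to the first and second parameters are 3 and 2, respectively.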

A change in the spring constants of any of the apical, basal, or lateral edges of a columnar cell (\({k}_{A}^{{col}},{k}_{B}^{{col}},{k}_{L}^{{col}}\)) led to changes in the shape of the tissue. It should be noted that the basal contractility in our model is the sum of the contractility generated by the basal actomyosin complex and the ECM. This agrees with our previous work, where we used a more detailed subcellular element model (SCE model) of a wing imaginal disc to show that the tissue shape is mainly generated by actomyosin contractility and maintained by the ECM30. We also found that varying the contractility of squamous cells did not impact overall tissue shape (Fig. 2c). This observation is consistent with a recent report where altering contractility and growth through downregulation of PI3K in the squamous epithelia did not significantly impact the shape of the wing imaginal disc33. Besides highlighting the significance of basal cell contractility and cellular geometrical properties, our study emphasizes the pivotal role of extracellular matrix (ECM) bending rigidity (\({K}_{{ECM}}\)). Through simulations, we demonstrate that increasing \({K}_{{ECM}}\) can effectively counteract the impact of basal contractility (\({k}_{B}^{{col}}\)) on basal tissue curvature, thus contributing to the maintenance of tissue shape (Fig. 2b). We sampled 50 points in the parameter space by varying \({k}_{B}^{{col}}\) and \({K}_{{ECM}}\) as shown in Fig. 2c, d. We next computed minimal energy configurations and measured the average central curvature (\({\kappa }_{C}\)) of the output shapes. Notably, an increase in \({K}_{{ECM}}\) led to a reduction of \({\kappa }_{C}\). Additionally, we observed an almost linear trend between \({K}_{{ECM}}\) and \({\kappa }_{C}\) at higher values of \({K}_{{ECM}}\), suggesting that the energy due to the ECM may dominate that of the contractility generated by the actomyosin complex. Further experiments are required to validate these findings.

Gaussian Process Regression surrogate model

To further benchmark the calibration of the nonlinear model, we selected a small subset of the parameters identified through sensitivity analysis. The selected parameters are highlighted in bold along the vertical axis of the heatmap within Fig. 2e. Latin Hypercube Sampling was used to sample this reduced parameter space uniformly49. Surface Evolver was run for all the sampled points, and the geometrical features were extracted. The sampled parameters and the corresponding morphological features constitute the input-output data for a surrogate model to be used for the optimization task. As an example, the sampled cross-sections were arranged based on an increasing Fréchet distance (top to bottom) (Fig. 2g) with respect to the target experimental cross-section \({S}_{i}\) (Fig. 2f). A low Fréchet error corresponds to a better approximation of the experimental data.
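As a rough illustration of the sampling step, the sketch below draws a Latin Hypercube sample using only the Python standard library. The function name, the seed, and the parameter bounds in the usage note are placeholders, not values from this study; library implementations of the same idea could equally be used.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Latin Hypercube sample within per-parameter (low, high) bounds.

    Each parameter range is split into n_samples equal strata, and every
    stratum is used exactly once per parameter, in a random order.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)
        columns.append(
            [low + (s + rng.random()) / n_samples * (high - low) for s in strata]
        )
    # Transpose so that each row is one parameter set.
    return [list(row) for row in zip(*columns)]
```

Calling `latin_hypercube(50, [(0.1, 10.0), (1.0, 100.0)])` returns 50 two-parameter sets, each of which would then be passed to the simulator.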

Our work used Gaussian Process Regression (GPR) as a surrogate model50,51,52. GPRs, also known as Kriging models, have a rich history in surrogate-assisted optimization53, especially for problems with ten to fifty degrees of freedom. We employed a leave-one-out strategy in which a model was trained iteratively, leaving exactly one data point out and using all the others for training52. The left-out data point was then used to assess the model. With this strategy, we found that the model predictions are in good accordance with the true output.
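The leave-one-out check can be sketched with a small NumPy Gaussian process. This is a stand-in under stated assumptions, not the surrogate used in this work: it uses a zero-mean GP with a fixed squared-exponential kernel, the length scale and noise values are arbitrary, and all function names are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between row-wise point sets a and b."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq / length_scale**2)


def gp_predict(x_train, y_train, x_test, length_scale=1.0, noise=1e-6):
    """Posterior mean of a zero-mean GP with fixed hyperparameters."""
    k = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    k_star = rbf_kernel(x_test, x_train, length_scale)
    return k_star @ np.linalg.solve(k, y_train)


def leave_one_out(x, y, **gp_kwargs):
    """Predict each point from a GP trained on all the other points."""
    preds = np.empty(len(x))
    for i in range(len(x)):
        keep = np.arange(len(x)) != i
        preds[i] = gp_predict(x[keep], y[keep], x[i:i + 1], **gp_kwargs)[0]
    return preds
```

Plotting the returned predictions against the true responses gives a parity plot; points close to the diagonal indicate that the surrogate generalizes to unseen parameter sets.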

Based on this result, we utilized all the data points within the parameter screening for the initial training of the GPR model (Fig. 3b). In the following sections, we used the described GPR model to develop a framework for Bayesian optimization of the physics-based model of wing disc morphogenesis.

Bayesian optimization of the Gaussian Process Regression (GPR) model of tissue shape enables parameter estimation of the physics-based models

The Bayesian optimization (BO) approach uses surrogate models to encode, and then update, a prior belief about the relationship between the inputs and the outputs used to calculate the objective function. The surrogate GPR model approximates the function value at each input as a Gaussian distribution, allowing quantification of uncertainty in the form of covariances. This is crucial for the computation of acquisition functions, described later in the text, which are used to sample new points within the parameter space, as shown in Fig. 3a.

To benchmark the parameter estimation pipeline, we generated a synthetic target shape with known parameters, which was not included in the training data. For training a GPR model, the Fréchet distance (\({F}_{B}\)) is first computed for all the samples in the screening dataset with respect to the synthetic target shape. The parameters and the corresponding negative Fréchet distances (\(-{F}_{B}\)) act as the input and output of our GPR model. BO uses an acquisition function guided by the GPR model to draw a new parameter set that maximizes the response value of the GPR. Since we chose \(-{F}_{B}\) as our response variable, maximizing it corresponds to finding the \(\varTheta\) leading to the shape closest to the selected target. We use Expected Improvement as our acquisition function. This allows the sampling of new points around the region that maximizes the output of the surrogate model. The exploration parameter (ζ) within the Expected Improvement defines the amount of exploration during the sampling process. Higher exploration parameters tend to sample points from regions where the GPR model uncertainty is high, instead of sampling guided mainly by the mean value of the surrogate model. The pseudo-code for the full pipeline is provided as Algorithm 1. This pipeline was implemented in Python (v 3.8.13) using Surface Evolver (v 2.40), Surrogate Modeling Toolbox (v 1.0.0), GPyTorch (v 1.5.0) and scikit-learn (v 0.24.2).
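The role of ζ can be seen in a short sketch of the commonly used closed form of Expected Improvement for a maximization problem. The exact variant implemented in the pipeline is not specified here, so this form, the function name, and the default ζ are assumptions for illustration.

```python
import math


def expected_improvement(mu, sigma, best, zeta=0.01):
    """Expected Improvement for maximization with exploration parameter zeta.

    mu, sigma: surrogate posterior mean and standard deviation at a candidate.
    best: largest response observed so far (here, the largest -F_B).
    """
    if sigma <= 0.0:
        return 0.0
    improvement = mu - best - zeta
    z = improvement / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return improvement * cdf + sigma * pdf
```

Consider two hypothetical candidates with `best = 0`: a confident one (`mu = 0.1`, `sigma = 0.01`) and an uncertain one (`mu = 0.0`, `sigma = 0.2`). With ζ = 0 the confident candidate scores higher, whereas with ζ = 0.05 the uncertain candidate does, which is the shift toward exploration that larger ζ produces.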

Three different values of the exploration parameter between 0 and 0.05 were selected to run the BO framework for a synthetic tissue cross-section whose parameter values were already known. We plotted the distribution of Fréchet errors for parameters sampled by the BO framework for the different ζ values. A kernel density was calculated to approximate the distributions (Fig. 3d). The plots reveal that increasing ζ tends to flatten out the distributions of errors between the tissue shapes generated by the sampled parameter and the synthetic target shape. This also suggests that the newly sampled points are farther away from the mean of the distribution as compared to points sampled using lower ζ values.

Algorithm 1

Bayesian optimization using GPR surrogate models.

We next plotted the best error sampled so far at each iteration of BO. The best error so far at the ith iteration is defined as the minimal Fréchet distance reported up to that step of the parameter sampling process. With increasing values of ζ, the points sampled have a lower \({F}_{B}\) than the best shape in the training data. An increase in ζ also allows for faster convergence (Fig. 3c). For each ζ, we also plotted the best parameter values, i.e., those yielding an approximation of the target shape with an error at or below the best error in the training data. Each vertical axis of the parallel coordinate plot represents the log of the parameter indicated in the horizontal axis label. The dashed black lines represent the true parameter values of the synthetic target shape. The parameters of the best shapes extracted using the pipeline are in good accordance with the synthetic target parameters (Fig. 3e, g).

We next analyzed the curvature (Hessian) of the objective function, i.e., FB, in the vicinity of the parameters of the target shape to assess the parameters that can be approximated using FB alone (Fig. 3f). The eigenvectors corresponding to the two largest eigenvalues predominantly point in the direction of parameters \({k}_{L}^{{col}}\) and \({K}_{{ECM}}\), respectively. This means that the predicted organ shape, as determined by \({F}_{B}\), is most sensitive to these parameters. Thus, \({k}_{L}^{{col}}\) and \({K}_{{ECM}}\) can be estimated with the least uncertainty. Previous studies demonstrate that the elasticity of ECM (modeled as \({K}_{{ECM}}\)) plays a significant role in determining bulk organ shape22,33. A change in lateral contractility (modeled as \({k}_{L}^{{col}}\)) is also known to cause severe morphogenetic defects as shown by the shape of the cross-section22.

The eigenvectors of the third and fourth largest eigenvalues point predominantly in the direction of \({k}_{B}^{{col}}\) and \({L}_{0,{L}}^{{col}}\), respectively, which means these parameters can be inferred with moderate uncertainty. Finally, the three smallest eigenvalues are near zero, and their eigenvectors predominantly point in the direction of \({L}_{0,{A}}^{{col}}\), \({k}_{A}^{{col}}\), and \({L}_{0,{B}}^{{col}}\). Thus, these parameters are (near) non-estimable using only \({F}_{B}\), and the corresponding parameter estimates are highly uncertain.
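The curvature analysis described above can be sketched in Python as a finite-difference Hessian followed by an eigendecomposition. This is an illustrative sketch only: the central-difference scheme, the step size, the helper names, and the toy quadratic objective are assumptions, and the three parameter labels are placeholders rather than model output.

```python
import numpy as np

def finite_difference_hessian(f, x0, step=1e-3):
    """Central-difference estimate of the Hessian of f at x0."""
    x0 = np.asarray(x0, dtype=float)
    n = x0.size
    hessian = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            e_i = np.zeros(n)
            e_j = np.zeros(n)
            e_i[i] = step
            e_j[j] = step
            value = (f(x0 + e_i + e_j) - f(x0 + e_i - e_j)
                     - f(x0 - e_i + e_j) + f(x0 - e_i - e_j)) / (4.0 * step**2)
            hessian[i, j] = hessian[j, i] = value
    return hessian

def rank_identifiability(hessian, names):
    """Eigenvalues in decreasing order, each with its dominant parameter direction."""
    eigvals, eigvecs = np.linalg.eigh(hessian)
    order = np.argsort(eigvals)[::-1]
    return [(float(eigvals[k]), names[int(np.argmax(np.abs(eigvecs[:, k])))])
            for k in order]

# Toy objective: stiff along the first axis, flat along the third.
names = ["k_L", "K_ECM", "L0_A"]
toy_objective = lambda p: 10.0 * p[0] ** 2 + p[1] ** 2 + 0.0 * p[2]
ranked = rank_identifiability(
    finite_difference_hessian(toy_objective, [1.0, 1.0, 1.0]), names)
```

Large eigenvalues mark directions in which the error rises steeply, so the associated parameters are well constrained, whereas near-zero eigenvalues mark directions along which the error is almost flat.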

A parallel coordinate plot showing the best parameters identified by the BO framework. Different colors represent different values and have been included as a legend.

Previous studies have highlighted the role of basal contractility (modeled as \({k}_{B}^{{col}}\)) in generating and maintaining the dome shape of wing imaginal disc30. A genetic loss of apical-basal contractility through the expression of the dominant negative form of Rho, an upstream regulator of actomyosin contractility, causes the tissue to flatten out22. However, a loss of actomyosin contractility upon pharmacological treatment with ROCK inhibitors did not cause severe changes in the shape of a late development stage 3rd instar wing imaginal disc30. Genetic perturbation, achieved by expressing RhoDN, was performed in the earlier stages of morphogenesis (2nd instar larval stage). In contrast, pharmacological perturbations utilizing ROCK inhibitors were administered during the later larval stage (early 3rd instar larval stage). This suggests that once the wing disc acquires its bent shape, it is less sensitive to changes made in apical-basal actomyosin contractility. Measurements of more variables like local cell lengths can also help better approximate these parameters, which is an open avenue for future investigations.

Through these steps, we conclude that our pipeline employing BO can successfully recover a subset of parameters of the computational model from the representation of the basal surface alone. The proposed methodology allows the transformation of the shape of an organ into a more meaningful parameter space, where each parameter in this space indicates distinct cytoskeletal regulators of wing disc morphogenesis. In the following sections, we describe how to transform mutant organ shapes into the parameter space to identify changes in cytoskeletal regulation that cause changes in the overall tissue shape. Such a pipeline can serve to identify new functions for specific genes and gene products.

Bayesian optimization framework predicts loss in contractility of columnar cells upon removal of ECM

As a first test of the framework, we enzymatically degraded the extracellular matrix (ECM) with Collagenase30 to test whether the BO framework can recapitulate specific perturbations to cell mechanics. In agreement with our earlier work, the removal of ECM led to a striking inversion or flipping of the curvature within the tissue (Fig. 4a, b), and a loss of inwards bending towards the basal surface of the columnar cells was observed. External contours along the basal surface of the experimental cross-sections were fed into the BO framework as inputs. Our pipeline next estimated the parameters of the Surface Evolver model that would best represent the shape changes. BO was carried out to minimize the Fréchet distance between the experimental cross-section and the simulated shape. The pipeline successfully captured the qualitative shape changes observed within the experimental data, as shown in Fig. 4c, d. Notably, the parameters estimated from the basal contours of the wing disc shape after collagenase treatment indicate a significant decrease in contractility compared to the control (Fig. 4e). All apical, basal, and lateral contractility (\({k}_{A}^{{col}},{k}_{B}^{{col}},{k}_{L}^{{col}}\)) levels decreased (Fig. 4f).
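For readers unfamiliar with the error metric, the following Python sketch computes a discrete Fréchet distance between two sampled contours using the standard dynamic-programming recursion. Whether the pipeline uses this discrete variant, and how the contours are sampled and aligned, is not stated here, so both are assumptions; the function name and example contours are hypothetical.

```python
import numpy as np

def discrete_frechet(curve_p, curve_q):
    """Discrete Frechet distance between two polylines given as (n, 2) point arrays."""
    p = np.asarray(curve_p, dtype=float)
    q = np.asarray(curve_q, dtype=float)
    dist = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)
    coupling = np.empty_like(dist)
    coupling[0, 0] = dist[0, 0]
    # First column and first row: only one curve can advance.
    for i in range(1, len(p)):
        coupling[i, 0] = max(coupling[i - 1, 0], dist[i, 0])
    for j in range(1, len(q)):
        coupling[0, j] = max(coupling[0, j - 1], dist[0, j])
    # Interior: advance along p, along q, or along both, keeping the cheapest path.
    for i in range(1, len(p)):
        for j in range(1, len(q)):
            cheapest = min(coupling[i - 1, j], coupling[i - 1, j - 1], coupling[i, j - 1])
            coupling[i, j] = max(cheapest, dist[i, j])
    return float(coupling[-1, -1])

# Two made-up contours: a flat line and the same line shifted up by one unit.
simulated = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
experimental = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
shift_error = discrete_frechet(simulated, experimental)
```

Because the metric respects the ordering of points along each contour, it penalizes a flipped curvature even when the two contours occupy a similar region.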

The basal contractility (\({k}_{B}^{{col}}\)) in our model is the sum of contractility imposed by the actomyosin complex (\({k}_{{con}}^{{col}}\)) and the extracellular matrix contractility (\({k}_{{con}}^{{ECM}}\)). Modeling these two independently within the Surface Evolver would increase the model complexity. No significant variations were found in the \({K}_{{ECM}}\) parameter, a multiplier of the curvature-based energy used to describe the ECM. The parameter \({K}_{{ECM}}\) contributes towards the elastic energy of the ECM. A higher \({K}_{{ECM}}\) penalizes for higher elastic energy, restricting the wing disc from folding, while a lower \({K}_{{ECM}}\) allows wing imaginal disc tissue to make dramatic curvature changes.

We also examined the curvature of the objective function (\({F}_{B}\)) around the average parameter value obtained from the set of sampled parameters generated by the BO framework (Supplementary Table 1) to assess the identifiability of the model parameters. The Surface Evolver output of the average parameter value yielded a shape similar to the one observed when ECM is degraded by collagenase (Supplementary Fig. 3A). Notably, the eigenvector corresponding to the dominant eigenvalue primarily aligns with the parameter \({k}_{L}^{{col}}\), indicating that the predicted organ shape, as described by \({F}_{B}\), is most sensitive to variations in this parameter (Supplementary Fig. 3B). Consequently, the estimation of \({k}_{L}^{{col}}\) can be carried out with minimal uncertainty. Conversely, the three subsequent eigenvalues exhibit significantly smaller magnitudes than the first one, and their predominant directions align with \({K}_{{ECM}}\), \({k}_{B}^{{col}}\), and \({L}_{0,{L}}^{{col}}\), respectively. This observation suggests that these parameters can be inferred with moderate uncertainty. Our framework suggests that tissue can still regulate its shape even with varying elasticity by modulating other subcellular features like cell edge lengths and actomyosin contractility. These predictions align with previous findings where a subcellular element model recapitulated the changes upon ECM removal by removing basal contractility and the ECM stiffness from the model30.

The control group in our pipeline resulted in two sets (Cluster 1, Cluster 2) of parameters that produced similar tissue shapes (Fig. 4c). Examination of the parameters reveals that the differences in shape were due to differences in columnar cell height (\({L}_{0,{L}}^{{col}}\)) (Fig. 4e). Cluster 1 corresponds to tissue shapes with lower lateral cell height but higher apical contractility. In contrast, Cluster 2 corresponds to a higher lateral cell height with higher basal contractility (Supplementary Table 1). This suggests that maintaining the inwards doming shape with stretching of the pouch cells during growth requires increased basal contractility. It can be achieved either through increasing the ECM stiffness or increasing actomyosin-mediated contractility. Our previous work demonstrated that both ECM stiffness and basal contractility play a crucial role in generating and maintaining the dome shape of the Drosophila wing imaginal disc30. Other recent work shows increased ECM stiffness as the Drosophila wing disc grows in size33. Further, the sum of apical and basal contractility is also higher for Cluster 2 than for Cluster 1. Cluster 2 also exhibits cells with lower apical cell area. As the pouch grows in size, the apical cell area decreases in size37. The reduction in this apical cell area also contributes to the increase in cell height. We also compared the ratio of columnar cell height (\({L}_{{Lateral}}\)) to apical cell diameter (\({D}_{{Apical}}\)), represented by the wing disc model parameters \({L}_{{col}}^{{lat}}\) and \({L}_{{col}}^{{api}}\) respectively. Our analysis, in line with Breen et al.’s54 findings, demonstrates a comparable ratio (Literature: \({L}_{{Lateral}}\) = 21.5 μm, \({D}_{{Apical}}\) = 3.49 μm, \({L}_{{Lateral}}\) / \({D}_{{Apical}}\) = 6.16, Model Predictions: \({L}_{{col}}^{{lat}}\) / \({L}_{{col}}^{{api}}\) = 10.03). Furthermore, we illustrate that tension in the basal columnar epithelium exceeds that in the apical columnar epithelium (\({k}_{{col}}^{{bas}}\) > \({k}_{{col}}^{{api}},{k}_{{col}}^{{lat}}\)) in agreement with Sui et al. 55.

Our framework also predicts a loss of contractility (\({k}_{{col}}^{{api},{bas},{lat}}\)) within the tissue upon Collagenase-mediated degradation of the extracellular matrix (ECM) (Fig. 4e, f). To test whether inhibition of cell-ECM adhesion, as a result of ECM degradation, could account for this loss, we downregulated mys, which encodes a β-integrin subunit, within the posterior compartment of the wing imaginal disc (Fig. 4h). β-integrin plays a crucial role in cell-ECM adhesion and transduction of extracellular signals into the cells42,56. Subsequently, we performed an immunohistochemistry assay using a phosphorylated non-muscle myosin (pMyoII) antibody to study the impact of the loss of cell-ECM adhesion on pMyoII expression. pMyoII, along with Actin, regulates the generation of contractile forces within the tissue30. Our findings indicate that the loss of β-integrin through en>mysRNAi results in decreased basal pMyoII levels in the posterior compartment (Fig. 4h, i). Furthermore, the knockdown of β-integrin qualitatively reproduces a similar phenotypic profile, characterized by the generation of folds in the opposite direction compared to the wild type.

In summary, by applying BO, we show that enzymatic degradation of ECM causes a reduction in global tissue actomyosin contractility. It also causes an inversion in the tissue curvature. Interestingly, our pipeline also reveals two distinct mechanisms of cytoskeletal regulation leading to similar tissue shapes defined by its outer surface. Analysis of the parameters of the two groups further describes biophysical mechanisms driving the thickening of the columnar cells as the tissue grows in size. Our model predicts that an increase in basal contractility and apical cell area is required to sustain the bent shape of the wing imaginal disc as it grows in size.

Piezo regulates basal epithelial curvature through actomyosin contractility and ECM elasticity

Next, we explored the utility of the BO framework for inferring mechanisms that define new morphological shapes downstream of specific genetic perturbations (Fig. 5a, b). To do this, we used the Gal4-UAS system to either knock down or overexpress Piezo in the dorsal compartment of the wing imaginal disc using an apterous-Gal4 driver. Piezo proteins are a class of mechanosensitive ion channels involved in regulating multiple biophysical processes57,58. However, their role in regulating overall organ shape is poorly understood. Cross-sections parallel to the AP axis were taken in the dorsal side of the pouch, guided by the apterous::GFP expression. The cross-section in the ventral side was used as an internal control. Knockdown of Piezo caused loss of doming within the pouch (Fig. 5e) along with a reduction in curvature of folds in the notum region of the tissue compared to the control (Supplementary Fig. 4). On the other hand, overexpression of Piezo caused the tissue to increase in bending as compared to the internal control (Fig. 5i).

We used the definition of the basal surface of the contours (Fig. 5c, e, g, i) as our inputs to the BO framework. Our framework identified shapes qualitatively matching the mutant cross-section (Fig. 5d, f, h, j). A comparative analysis between the model parameters of the best cases identified revealed a decrease in both apical and lateral contractility (defined by \({k}_{A}^{{col}},{k}_{L}^{{col}}\)) upon Piezo knockdown as compared to the predictions for the internal control and the global Oregon-R control. \({k}_{L}^{{col}}\) also increased upon overexpression of Piezo in a compartment-specific manner (Fig. 5k–l). As predicted through model analysis, quantifying the expression of pMyoII within the Piezo mutants also shows an increase in both apical and lateral pMyoII upon overexpression of Piezo (Fig. 5q, r, Supplementary Fig. 4). We also report a decrease in pMyoII in a compartment-specific manner upon Piezo knockdown (Fig. 5o, p, Supplementary Fig. 4).

Apart from changes in actomyosin contractility (\({k}_{i}^{{col}}\)), our model also predicts a compartment-specific increase in \({L}_{0,{B}}^{{col}}\) upon a knockdown of Piezo (Fig. 5m). It further predicts a compartment-specific decrease but a global increase in \({L}_{0,{L}}^{{col}}\) on Piezo overexpression as compared to PiezoRNAi-expressing discs. An increase in pMyoII levels upon Piezo overexpression can cause the lateral pouch to contract, decreasing its length (Fig. 5n, Supplementary Fig. 4). Interestingly, our pipeline also predicts an overall decrease in \({K}_{{ECM}}\) upon Piezo overexpression (Fig. 5g, i). A previous study of cell dissemination conducted in Drosophila midgut has shown that Piezo is required for the degradation of the ECM through RasV12 59. Further experiments are needed to measure changes in ECM elasticity upon Piezo overexpression. Changes in \({K}_{{ECM}}\) suggest that there will be experimentally observable changes in the composition of the ECM, including regulation of remodeling. Furthermore, Piezo plays a crucial role in the degradation of the ECM through its interaction with matrix metalloproteinase 1 (MMP1), an endopeptidase involved in ECM remodeling, and Collagen IV, a fundamental component of the basal ECM59,60,61,62. To test the implications of this interplay in the wing imaginal disc, we evaluated the expression of Collagen IV and MMP1 in Piezo downregulated (ap>PiezoRNAi) tissues. Consistent with the general trends predicted by the model, we observed a downregulation of MMP1 (Fig. 5s, t) and an upregulation of Collagen IV (Fig. 5u, v) for Piezo knockdown, supporting the interpretation that Piezo regulates ECM elasticity in the wing discs.

Overall, this section used a combination of quantitative image analysis of fixed tissues and the BO platform to infer potential mechanisms by which Piezo mechanosensitive ion channels regulate fold formation. Our work highlights that Piezo regulates bending in the Drosophila wing imaginal disc by regulating actomyosin contractility, ECM elasticity, or single-cell volume. A more detailed experimental analysis is required to establish mechanisms of cytoskeletal regulation through Piezo.

Discussion

Morphogenesis involves dynamic shape changes within the organ until it reaches a target size and shape63,64. Changes in the global shape of an organ arise from the integration of changes at the cellular level over time. A key challenge for quantitative systems biology is to elucidate how the complex interplay of chemical signals at the single-cell level contributes to the overall organ shape. However, the large diversity of proteins that regulate shape changes and the difficulty of measuring them all at once necessitate the formulation of computational models65. A major challenge in calibrating multiscale models of morphogenesis is that they are often highly nonlinear and computationally expensive. Both conventional gradient-based optimization methods (e.g., gradient descent, quasi-Newton methods) and MCMC for Bayesian calibrations require too many evaluations for computationally expensive large-scale models, including subcellular element simulations, to be practical.

In this work, we developed and validated a computational framework, which is robust to initial conditions (Supplementary Fig. S5), to efficiently infer parameter sets with Bayesian optimization using GPR surrogate models that match experimentally obtained shapes of organs. We found that the optimization process works best using Fréchet distance as the error metric, i.e., the objective function (Fig. 2). A multi-faceted Bayesian optimization and local sensitivity analysis revealed that actomyosin contractility and extracellular matrix stiffness are the primary contributors to basal curvature and shape control (Fig. 3). This confirms previous reports that relied on experiments and scenarios of more complex subcellular element simulations30, demonstrating the robustness of the conclusions.

The current model calculates errors based on overall changes in tissue shape using distances. Often, problems in complex biology are nonlinear and sloppy, leading to multiple parameters within the model that can generate similar shapes66. Our pipeline suggests that tissues with different patterns of apical-basal contractility can generate similar shapes of basal curvature while also revealing variable regulation of columnar cell height (Fig. 3c). The identification of multiple solutions depends on the number of morphological features used in generating the optimization cost function. For example, including additional components within the objective function in the form of columnar cell height can reduce the number of solutions. The selection rationale of morphological features to include in the cost function depends on the form of experimental data and the choice of the physics-based model. In future work, expected improvement should be compared against alternative acquisition functions such as lower/upper confidence bounds, probability of improvement, and Thompson sampling, which may result in improved performance.
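To make the comparison between acquisition functions concrete, the Python sketch below scores the same three candidate points with expected improvement, probability of improvement, and a lower confidence bound, all written for minimization. The surrogate means and standard deviations are made-up numbers, the symbols `zeta` and `kappa` and their default values are assumptions, and Thompson sampling is omitted because it requires posterior samples rather than pointwise summaries.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def normal_cdf(z):
    return 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))

def normal_pdf(z):
    return np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)

def expected_improvement(mean, std, best, zeta=0.01):
    """Expected amount by which a candidate beats the incumbent error."""
    gain = best - mean - zeta
    z = gain / std
    return gain * normal_cdf(z) + std * normal_pdf(z)

def probability_of_improvement(mean, std, best, zeta=0.01):
    """Probability that a candidate beats the incumbent error by at least zeta."""
    return normal_cdf((best - mean - zeta) / std)

def lower_confidence_bound(mean, std, kappa=2.0):
    """Optimistic error estimate; the candidate with the smallest value is chosen."""
    return mean - kappa * std

# Made-up surrogate predictions for three candidate parameter sets.
mean = np.array([0.50, 0.40, 0.60])
std = np.array([0.05, 0.01, 0.30])
incumbent = 0.45

ei_pick = int(np.argmax(expected_improvement(mean, std, incumbent)))
pi_pick = int(np.argmax(probability_of_improvement(mean, std, incumbent)))
lcb_pick = int(np.argmin(lower_confidence_bound(mean, std)))
```

With these numbers, probability of improvement selects the candidate with a small but nearly certain gain, whereas expected improvement and the lower confidence bound both select the highly uncertain candidate, which illustrates why the choice of acquisition function can change sampling behavior.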

At this stage, our model estimates errors based on the overall geometry of the organ, as quantified by a single metric, the Fréchet error. As a limitation, it does not provide predictions on the geometry of individual cells or the shape of the internal lumen. This limitation is partly mitigated because features such as tissue height can be inferred from the cross sections. Future development of this modeling framework can incorporate additional optimization criteria for matching individual cell shapes and the shape of the lumen. Further, a fully 3D modeling framework remains a natural future step. To do so, one can incorporate multi-objective optimization principles to identify the Pareto-optimal trade-offs between solutions under alternative error metrics23. Such enhancements will lead to greater predictive capabilities and enable deeper mechanistic insights into the cellular contributions to an organ’s shape. The computational cost will grow to achieve this, and the surrogate modeling must account for relative accuracy for a given computational cost.
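The notion of a Pareto-optimal trade-off can be illustrated with a short Python sketch that filters candidate parameter sets scored by two error metrics. The two metrics (for example, a basal contour error and a cell height error), the numbers, and the function name are hypothetical.

```python
import numpy as np

def pareto_front(costs):
    """Indices of rows that no other row dominates (all objectives minimized)."""
    costs = np.asarray(costs, dtype=float)
    keep = np.ones(len(costs), dtype=bool)
    for i, row in enumerate(costs):
        # A row is dominated if another row is no worse in every metric
        # and strictly better in at least one.
        dominated_by = np.all(costs <= row, axis=1) & np.any(costs < row, axis=1)
        keep[i] = not dominated_by.any()
    return np.flatnonzero(keep)

# Made-up (contour error, cell height error) pairs for five parameter sets.
candidate_errors = [
    [0.20, 0.90],
    [0.40, 0.40],
    [0.90, 0.10],
    [0.50, 0.50],
    [0.30, 0.95],
]
front = pareto_front(candidate_errors)
```

The surviving rows are those for which one error cannot be lowered without raising the other, which is the set a multi-objective extension of the pipeline would report instead of a single best point.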

The BO framework, in conjunction with immunohistochemistry assays, further reveals the role of Piezo mechanosensitive ion channels in regulating fold formation during wing imaginal disc morphogenesis (Fig. 5). It does so through a combined regulation of patterning in apical-basal contractility and cell volume. Even though significant efforts have investigated the roles of Piezo in regulating single-cell processes like proliferation, apoptosis, and cytoskeletal regulation67,68,69,70,71, emergent functions at the next, multiscale hierarchy of overall organ function remain poorly understood. Previous studies related to Piezo have shown its impact on regulating RhoA72. Further, RhoA is upstream of pMyoII, suggesting a potential mechanism. Since a knockdown of Piezo significantly decreased the accumulation of basal pMyoII (Fig. 5o, p), we also hypothesize that this regulation may occur through integrin-clustering-mediated activation of mechanosensation. Increased basal contractility reduces the basal cell area, bringing integrin molecules sufficiently close to form clusters73,74. Once initiated, integrin clusters can stabilize the formation of focal adhesion complexes to increase tension further75. The final step can be the activation of Piezo to regulate Rho through Ca2+ to further promote pMyoII and consequently contractility76. This is equivalent to the autoregulation of pMyoII, which is supported by the experimental evidence that a loss of Integrin in the Drosophila wing imaginal disc is also known to reduce basal pMyoII levels22. This work thus motivates future experiments to map the exact mechanisms of regulation of pMyoII by Piezo.

Our computational pipeline also predicts a decrease in ECM elasticity (modeled as \({K}_{{ECM}}\)) upon overexpression of Piezo (Fig. 5g, i). Previous studies have proposed the degradation of ECM through Piezo in Drosophila midgut59. In accordance with the literature, our experimental data confirm that loss of Piezo reduced the levels of MMP1, an enzyme known to degrade ECM. The work further proposed that Piezo can also contribute towards ECM degradation through Ca2+-mediated activation of Calpains. Further experimental measurements of ECM elasticity are required to confirm the predicted function of Piezo during organ development.

Future work can also further define morphogenesis as a multi-objective optimization problem based on multiple outputs of the model and corresponding measurable features in experimental data. The pipeline can be extended for organ-level drug screening, allowing for the study of new mechanical functions of genes due to genetic or pharmacological perturbations. Further, this framework can be extended to studying other physical and biochemical processes such as embryogenesis77,78 and models of plant development79. Of note, it can also be used to study any models of organogenesis as it is independent of the modeling framework or package used in the physics-based simulation, which makes it attractive for more complex computational models that incorporate subcellular elements22,30,80,81. In summary, this computational framework enables the systematic elucidation of generalizable biological rules of the morphogenesis of multicellular systems.

Methods

Experimental and image analysis methods

Fly stocks and culture

Drosophila was grown within an incubator maintained at 25 °C. The flies were maintained on a 12-hour light/dark cycle. Virgins from the Gal4 driver colonies were collected twice daily. For the first collection, the bottles were emptied before 6 h of collection. Female virgins with Gal4 drivers were crossed with UAS-transgene male flies in a 10-15:4 ratio. Early 3rd instar wandering larvae were collected to dissect wing imaginal disc tissue. The wildtype Oregon-R fly line is a long-standing stock in our group originally acquired from the N. Yakoby lab. The other transgenic stocks and their sources were UAS-PiezoRNAi (VDRC #105132) and UAS-Piezo (BDRC #58772/58773).

Immunohistochemistry

Wing imaginal discs were dissected in a phosphate-buffered saline (PBS) solution before fixation in a 4% paraformaldehyde in PBS solution. Quantification included all samples in the reported sample size. Fixation was done by placing the PCR tubes containing wing disc samples in an ice bath for an hour. Post-fixation, the samples were rinsed with a fresh PBT solution (PBS with 0.03% v/v Triton X-100). Three quick rinses were followed by three 10-minute-long washes. PBT within the tubes was next replaced with 250 μL of 5% normal goat serum (NGS) in PBS and agitated for an hour. Following this, the NGS solution was replaced with the primary antibody solution, and the tubes were left on a rotating platform at 4 °C overnight. The following primary antibodies were used: i) Phospho-Myosin Light Chain 2 (Ser19) (1:50, Rabbit, Cell Signaling Technology #3671S), ii) Integrin \(\beta\)PS (myospheroid) (1:5, Mouse, Developmental Studies Hybridoma Bank CF.6G11), iii) Anti-MMP1 (1:1000, Mouse, Developmental Studies Hybridoma Bank 3B8D12), iv) α-Collagen IV (1:5, Rabbit, Abcam ab6586). The next day, the primary antibody was replaced with PBT. After three quick rinses and three 15-minute-long washes, the PBT in the tube was replaced with a secondary antibody solution. The following secondary antibodies and dyes were used in our studies: α-Rabbit Alexa Fluor™ 647 (1:500, Goat, Thermo Fisher Scientific A32733), α-Mouse Alexa Fluor™ 568 (1:500, Goat, Thermo Fisher Scientific A-11031), DAPI (1:500, Sigma Aldrich D9542), and Fluorescein Phalloidin (1:500, Thermo Fisher Scientific F432). After two hours of secondary antibody and dye incubation at room temperature, avoiding light exposure, three rinses of PBT were carried out. After two additional long washes of 15 min, the samples were left for overnight incubation and agitation in PBT at 4 °C. The next day, the samples were mounted on a coverslip with spacers to avoid squishing the samples. Spacers were designed using two layers of bio-compatible tape to create a well to place the samples. Vectashield mounting medium was added, and a coverslip was placed atop, aligned with the spacers.

Confocal microscopy

Imaging of wing imaginal disc samples was done with three different microscopes: a Nikon Eclipse Ti confocal microscope with a Yokogawa spinning disc, a Nikon A1R-MP laser scanning confocal microscope, and a Leica Stellaris 8 DIVE point scanning confocal microscope. For the three microscopes, image data were collected on an iXonEM+ cooled CCD camera (Andor Technology, South Windsor, CT) using MetaMorph v7.7.9 software (Molecular Devices, Sunnyvale, CA), NIS-Elements software, and LAS X microscope software, respectively. The step size for acquiring 3D data was kept between 0.5–1 μm, depending on sample thickness. Imaging was done using 40× and 60× oil objectives with 200 ms exposure time, and 50 nW, 405 nm, 488 nm, 561 nm, and 640 nm laser exposure.

Image analysis

All the raw data presented within the manuscript were analyzed using FIJI/ImageJ. Quantification of fluorescence intensity was carried out using an in-house MATLAB code whose details can be found in the supplementary information of the text. CSBDeep, an ImageJ plugin, was used for deconvolution and denoising of the Actin channel. A rolling ball background subtraction was also used to remove background noise. QuickStitch82 was used for stitching individual tiles while imaging the entire volume of the imaginal disc.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.