Introduction

Bronchopulmonary dysplasia (BPD) is the most common long-term morbidity among premature infants, with an incidence inversely related to gestational age and birth weight.1 Currently, most definitions of BPD use a threshold level of treatment to diagnose both the presence and the severity of BPD.2 Though these clinical treatment definitions of BPD are easy to use, there is increasing effort to devise more robust ways of identifying and categorizing the disease.3 Artificial intelligence (AI) and machine learning (ML) present promising avenues for exploration of this disease. As computing power has increased over the past decade, techniques in these areas have found more practical uses in healthcare. Researchers have used these techniques to gain insight into various diseases, but applications in the field of BPD have been limited. In this manuscript, we present an overview of the challenges in defining a continuously evolving disease such as BPD and the potential role of AI in meeting these challenges.

Bronchopulmonary dysplasia: definition and challenges

BPD was initially described by Northway et al. in 1967 in a series of infants who showed characteristic findings suggestive of severe injury to the developing lung following prolonged exposure to high pressure and oxygen concentration during mechanical ventilation.4 At the time, the only infants who survived were relatively mature and larger, in contrast to the current era, in which infants survive at much earlier stages of pulmonary development. This increased survival has not only contributed to a persistently high incidence of BPD5,6 but also resulted in marked changes in the pathophysiology and clinical picture of BPD: an evolution from a disease with severe respiratory failure resulting from alveolar and airway damage (“Old BPD”) to a disease with a combination of developmental arrest and lung injury from various antenatal and postnatal factors beyond merely positive pressure and oxygen (“New BPD”).7 These changes have presented the unique challenge of optimally defining a clinical disease with evolving multifactorial pathogenesis in a patient population with inherently variable risk of pulmonary dysfunction due to immaturity of the respiratory system.

The definitions and criteria for patient stratification in BPD have struggled to keep up with the advances in management and changes in the disease. An ideal set of diagnostic criteria should accurately classify infants based on the pathophysiology or organ dysfunction. Such criteria should also create a stratification system that could inform clinical decision-making and allow for better long-term prognostication. BPD is one of the few diseases defined by the necessity of treatment rather than the pathologic picture or organ dysfunction. Most current definitions use a combination of pressure or flow with fraction of inspired oxygen at a point in time to determine the severity of the disease.8,9,10 Though these formulas remain convenient and accessible ways to define BPD for epidemiological and benchmarking purposes, they have multiple shortcomings, as reflected by their limited prognostic capabilities and the need for new definitions as patient populations and treatment practices change. An ideal BPD definition will be both simple and precise and will support prognostication as well as the development of effective prevention and management strategies. The need to accurately define BPD cannot be overstated, as the disease continues to impose a significant burden not only on patients and families but also on the healthcare system and society at large.11,12 Overall, infants with BPD have a poor long-term prognosis, including worse neurodevelopmental,12,13,14,15,16,17 respiratory, and cardiac outcomes lasting at least into adolescence.11,18,19 The lifelong effects of these morbidities are not fully known, as the oldest extremely low birth weight (ELBW, <1000 g) infants are just now reaching adulthood.16,20,21,22

Management of BPD is mostly focused on preventive strategies to help limit injury and promote repair mechanisms during the critical stages of lung development. Therefore, it is critical to identify these infants early in their course, prior to significant lung injury. One of the reasons for the limited success in developing effective preventive strategies for BPD has been the inability to identify a specific population that can benefit from a given strategy. Early prediction of infants at greater risk of developing severe disease would not only help in developing preventive strategies but also aid in early management and treatment of individual patients. There have been some efforts to develop prediction tools, with the most commonly used being the 2011 online calculator by the National Institute of Child Health and Human Development (NICHD) Neonatal Research Network (NRN), which uses birth characteristics to provide a risk estimate of developing BPD.23 However, there are several limitations, including the relatively small number of variables in the calculator, inapplicability to some ethnicities or gestational ages, and limited relevance to more contemporary patient populations. The use of large datasets through AI may help build robust predictive models that not only identify infants with BPD but also differentiate the diagnosis into its various physiological forms.

Possible avenues for AI research in BPD

Infants who develop BPD are usually hospitalized for several months, generating immense volumes of clinical data, including radiological images, lab values, ventilator settings, vital signs, growth parameters, clinical notes, operative reports, fluid balances, nursing assessments, and flowsheet values. While these provide a valuable and evolving picture of the infant over time, the sheer volume can be overwhelming and extremely time consuming to analyze manually. Developing patterns may be missed given the frequent turnover of clinicians and the impracticality of going through months of data in a reasonable time. Clinicians will often focus on a few select variables and follow them over time, potentially missing other clinically important information.

ML provides an ideal method to analyze these data to look for patterns. Machines are significantly more efficient at analyzing large datasets like the ones described above and may yield unique insights that humans might miss.24 The complex interplay between genetics, prenatal factors, and postnatal course that marks the development of BPD is more amenable to analysis by AI than by individual humans. Potential uses of ML include developing prediction models by combining different antenatal risk factors with postnatal clinical, radiological, and laboratory parameters. Having an AI-based algorithm that monitors the patient over time may provide insights into ideal times for intervention (such as the use of postnatal steroids), which may significantly improve outcomes for those patients while minimizing exposure in patients who are less likely to benefit. One may be able to use an ongoing evaluation of vital signs, ventilator data, and lab values to produce a score that delineates the risk of respiratory decompensation specific to patients with BPD, helping clinicians to identify infants who need to be monitored more closely. The same data might also serve as a prognostic marker to establish the need for additional specific outpatient monitoring or services.

There has been increasing interest in exploring the role of ML in this field. In a recent study looking at clinical data available at birth in addition to a gastric aspirate marker, researchers were able to predict the development of BPD in a cohort of patients with a sensitivity of 88% and a specificity of 91%.25 Another group of researchers used a combination of fourteen clinical variables to build an expert system to predict the development of BPD with an accuracy of 83% in the first week of life.26,27 Using the insights from these studies along with other variables such as imaging and respiratory support, the prediction models can potentially be improved, with the hope of eventually giving actionable data to mitigate the risk and sequelae of BPD in at-risk infants. However, these studies typically trained their ML models on the same ambiguous clinical diagnostic criteria whose limitations are described above, which limits the value of the models’ predictions. The use of unsupervised learning in this arena might be particularly useful, as it may reveal categorization of BPD by prognostic factors or therapeutic responsiveness not currently identified.
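For readers less familiar with the performance measures quoted above, the following minimal Python sketch shows how sensitivity and specificity are derived from a model's binary predictions. The function name and the labels in the usage note are hypothetical illustrations, not data or code from the cited studies.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumed encoding for this illustration: 1 = developed BPD, 0 = did not.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed cases
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # fraction of infants with BPD correctly flagged
    specificity = tn / (tn + fp)  # fraction of infants without BPD correctly cleared
    return sensitivity, specificity
```

With four hypothetical infants who developed BPD (three correctly flagged) and six who did not (five correctly cleared), the function returns a sensitivity of 0.75 and a specificity of 5/6.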

It is increasingly clear that defining BPD as one clinical entity based on the need for respiratory support is likely to be overly simplistic. It is likely that what we currently classify as BPD is a combination of different disease processes affecting lung tissue, airway, pulmonary vasculature, and respiratory control center in different combinations. This was reflected in a recent study in which, using pulmonary function data, researchers were able to identify a purely restrictive phenotype of BPD, a disease classically thought to have a significant obstructive component.28 This identification of distinct BPD phenotypes (whether by ML or traditional methods) is crucial for the development and testing of targeted therapies. Perhaps several of the therapies that have already been tested might be useful treatment modalities for a specific subset of patients, even if they are not effective for all patients with BPD.

ML combined with natural language processing of clinical documentation has many potential applications in guiding the evaluation and course of BPD, such as identifying patients in need of referral to services (such as physical, occupational, or speech therapy) or interventions (such as tracheostomy), which may lead to a more prompt evaluation and intervention.

Limitations and pitfalls

There are several limitations to using machine learning. One of the main dangers is the problem of overfitting a model to the data. Overfitting occurs when an algorithm gives undue weight to a feature that works to categorize the training dataset but fails to categorize a general dataset. The two main contributors to this are small dataset size and bias. One strategy to assess the degree of overfitting is to use a holdout dataset, where a certain percent of representative data are reserved prior to training so that the model can be benchmarked on these data that had no influence during training. In addition, it is important for the models to be tested in other scenarios and situations (for example, on a dataset from another institution or group) to assess their generalizability and applicability.
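The holdout strategy described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name, the 20% holdout fraction, and the fixed random seed are hypothetical choices, and the split is stratified by outcome label so that the reserved data remain representative of the full cohort.

```python
import random


def stratified_holdout(labels, holdout_fraction=0.2, seed=0):
    """Split record indices into (train, holdout), preserving the class ratio.

    labels: one outcome label per infant (for example 1 = BPD, 0 = no BPD).
    The holdout indices must not be used in any way during model training.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train, holdout = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Reserve the same fraction of every class (at least one record each).
        n_holdout = max(1, round(len(indices) * holdout_fraction))
        holdout.extend(indices[:n_holdout])
        train.extend(indices[n_holdout:])
    return sorted(train), sorted(holdout)
```

A model would then be fitted on the training indices only and benchmarked once on the holdout indices. A large gap between training and holdout performance suggests overfitting.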

Within BPD, the issue of small dataset size may come up depending on the variables that are used as inputs. BPD remains a relatively rare condition when compared with many adult diseases.29 In addition, while plain radiographs are commonly performed during the clinical course in infants with BPD, more advanced imaging modalities that are better able to delineate the extent of lung disease, such as computed tomography (CT) or ultrashort echo time magnetic resonance imaging (UTE MRI),30,31 are still infrequently performed due to the risk associated with ionizing radiation for the former and the cost, availability, and complexity of obtaining the latter. The size of any current dataset with CT or MRI in infants with BPD is likely to be far too small to train image processing algorithms with most ML techniques. Though techniques like transfer learning or cross-validation may help to augment the power of these algorithms,32 they must be used carefully to minimize the risk of overconfidence in the results.

An ideal scenario could be the development of algorithms that would take data that clinicians already collect on a regular basis for patient care for its inputs, requiring no additional equipment, monitoring, labs, or data gathering. This would allow the widest possible adoption of the algorithm and may lead to better standardization of care across institutions. However, this must be done carefully, as the frequency of labs and imaging varies significantly among institutions, so that an algorithm developed at one center may not be applicable at another.

While ML holds promise over traditional methods of analysis in some areas, its algorithms are not universally superior. In a large cohort of patients from the Canadian Neonatal Network, traditional logistic regression performed better than the tested ML models in predicting the outcome of BPD or death.33

In addition, machine learning can only learn from the data that exist, and if those data are of poor quality or insufficient quantity, the models may not yield useful results.34 Because of the “black box” nature of some AI algorithms,35 particularly the deep learning-based algorithms, there may be clinician distrust if a result suggests an intervention that the clinician does not agree with.36,37 Clinician input for feature and model selection will help select model building blocks that are clinically plausible and reliable. Furthermore, models should undergo robust prospective validation on a large representative dataset before implementation in clinical practice. Models must also be continually retrained and revalidated to ensure that they continue to perform well in the context of temporal dataset shift; otherwise, they may lose accuracy and effectiveness over time.38,39
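One simple way to watch for temporal dataset shift is to track a deployed model's accuracy by time period and flag periods that fall below the accuracy measured at validation. The Python sketch below is a hypothetical illustration: the function names, the record layout, and the 0.05 tolerance are assumptions, and a real monitoring program would also need adequate sample sizes per period and clinically chosen thresholds.

```python
from collections import defaultdict


def accuracy_by_period(records):
    """Compute accuracy per time period.

    records: iterable of (period, y_true, y_pred) tuples, where period is any
    sortable key such as a birth year.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for period, y_true, y_pred in records:
        totals[period] += 1
        hits[period] += int(y_true == y_pred)
    return {period: hits[period] / totals[period] for period in sorted(totals)}


def periods_needing_review(records, baseline_accuracy, tolerance=0.05):
    """Return periods whose accuracy fell more than `tolerance` below baseline."""
    per_period = accuracy_by_period(records)
    return [
        period
        for period, accuracy in per_period.items()
        if accuracy < baseline_accuracy - tolerance
    ]
```

Periods returned by `periods_needing_review` are candidates for closer inspection and, if the decline is confirmed, for retraining and revalidation of the model.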

Conclusions

Bronchopulmonary dysplasia remains one of the most challenging diseases to prevent and manage in the field of neonatology. Since its first description, clinicians and researchers have devoted significant time to understanding and characterizing the pathology, the clinical phenotype, and outcomes. Additionally, developing optimal management strategies has remained difficult. Current research indicates that BPD is a significantly heterogeneous condition that will require the use of novel methods for precise diagnoses, targeted treatments, and optimal outcomes. AI offers several promising avenues of research which may offer new insights into this disease. In contrast with the proliferation of AI models in other health fields, there is significantly lower penetration of this technology within pediatrics and especially in BPD. We are still far from the point where such models can be used in routine clinical decision making. Progress will undoubtedly be incremental, but there remains much untapped potential in the data that we already possess and continually collect. AI and ML may help develop more nuanced definitions of BPD and help guide evidence-based preventive and treatment strategies for BPD. We currently have a great opportunity to use these technologies to advance our understanding of BPD and to tackle its significant challenges.