Abstract
The goal of this study was to develop a deep learning-based algorithm to predict temporomandibular joint (TMJ) disc perforation based on the findings of magnetic resonance imaging (MRI) and to validate its performance through comparison with previously reported results. The study subjects were obtained by reviewing medical records from January 2005 to June 2018. A total of 299 joints from 289 patients were divided into perforated and non-perforated groups based on the existence of disc perforation confirmed during surgery. Experienced observers interpreted the TMJ MRI images to extract features. Data containing those features were used to build and validate prediction models using random forest and multilayer perceptron (MLP) techniques, the latter implemented with the Keras framework, a recent deep learning interface. The area under the receiver operating characteristic (ROC) curve (AUC) was used to compare the performances of the models. MLP produced the best performance (AUC 0.940), followed by random forest (AUC 0.918) and disc shape alone (AUC 0.791). The MLP and random forest were also superior to previously reported results using MRI (AUC 0.808) and an MRI-based nomogram (AUC 0.889). Deep learning showed superior performance in predicting TMJ disc perforation compared to conventional methods and previous reports.
Introduction
Disc perforation occurs in the late stage of temporomandibular joint (TMJ) disease. It may affect treatment planning and can be useful in predicting the prognosis of the disease1,2,3. Magnetic resonance imaging (MRI) is considered the gold standard for examination of the TMJ disc2,4. However, reports predicting disc perforation based on MRI vary in their results, and the diagnostic accuracy of MRI in detecting TMJ disc perforation is known to be poor2,3,5,6,7.
Artificial intelligence (AI) technology is beginning to affect our daily lives, the field of medicine not excepted. Indeed, the number of articles applying machine learning to medical research has been growing rapidly in recent years8,9. Several studies have also evaluated diagnosis and prognosis in the field of oral and maxillofacial surgery. Zhang et al. reported a model that predicts postoperative facial swelling after third molar extraction with 98% accuracy using an artificial neural network10. Kim et al. also applied machine learning to predict the occurrence of bisphosphonate-related osteonecrosis of the jaw11.
In addition, deep learning, a class of machine learning, is increasingly being applied in the field of diagnosis and prediction related to medical imaging, yielding impressive results12. Yang et al. reported favorable results for automated detection of cysts and tumors of the jaw in panoramic images13. Lee et al. reported that cephalometric images can be used for differential diagnosis between orthognathic surgery and orthodontic treatment based on deep convolutional neural networks, with a success rate of 95.4–96.4%14. More recently, there have been several attempts to diagnose osteoarthritis of the temporomandibular joint from cone-beam computed tomography images using machine learning15,16.
Although deep learning based on medical imaging has had a monumental impact, processing a stack of MRI slices rather than a single image such as a fundus photograph requires considerable effort as well as significant computational resources in terms of memory and processing speed17,18,19. Despite the ability of deep learning to extract features directly from data, predictions made solely by machines are limited in terms of accuracy and reliability, and they raise legal, ethical, and psychosocial issues8,9,12.
Thus, in this study, experienced observers interpreted the TMJ MRI images to extract data on specific features for use in building machine learning-based prediction models. The goal was to construct machine learning models to predict TMJ disc perforation based on experienced investigators’ MRI readings and to validate the performance of the models. Random forest and multilayer perceptron (MLP), a class of deep learning methods, were used. To our knowledge, there have been no published studies applying deep learning or machine learning to TMJ disc perforation.
Results
A total of 299 temporomandibular joints from 289 patients were included in this study. The characteristics of the study subjects are shown in Table 1. In univariate analysis, all parameters except fluid collection differed significantly between the non-perforated and perforated groups. A multiple logistic regression analysis was performed with the parameters that were significant in the univariate analysis (Table 2). The factors significant in multivariable analysis were increased age, disc shape (eyeglasses or amorphous), low signal intensity of bone marrow in MRI, joint space, and changes in the condyle and fossa, consistent with a previous study5. Female patients were approximately twofold more likely to have disc perforation than male patients. When disc shape was amorphous, the possibility of disc perforation was increased almost 45-fold compared to normal disc shape.
The machine learning models were built based on the above results. The training progress is shown by plotting the loss at each iteration in Fig. 1. Statistically significant factors from the above analyses were considered when constructing the random forest and MLP models. MLP produced the highest performance (AUC 0.940), followed by random forest (AUC 0.918) and disc shape alone (AUC 0.791) (Fig. 2, Table 3). The AUCs of MLP and random forest exceeded those in previous reports (AUC 0.808, ref. 3; AUC 0.889, ref. 5).
The random forest model yields the importance of each variable, as shown in Fig. 3. According to our model, disc shape had the greatest impact among the variables that were significant in the multiple logistic regression analysis.
The sensitivity and specificity of each model at its optimal cutoff were also investigated. The MLP model showed a sensitivity of 85.2% and a specificity of 84.8%. The random forest model showed a sensitivity of 96.3% and a specificity of 75.8%. These results are summarized in Table 4.
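For readers less familiar with these two measures, the short Python sketch below shows how they follow from a confusion matrix. The counts are hypothetical: they are not reported in this paper and were chosen by us only because, for a validation set of 60 joints, they reproduce the percentages quoted above. The function name is also ours.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical confusion-matrix counts (assumed, not taken from the paper).
mlp_sens, mlp_spec = sensitivity_specificity(tp=23, fn=4, tn=28, fp=5)
rf_sens, rf_spec = sensitivity_specificity(tp=26, fn=1, tn=25, fp=8)

print(f"MLP:           sensitivity {mlp_sens:.1%}, specificity {mlp_spec:.1%}")
print(f"Random forest: sensitivity {rf_sens:.1%}, specificity {rf_spec:.1%}")
```

The two models trade the measures off differently: with these assumed counts the random forest misses fewer perforated joints but raises more false alarms.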
Discussion
The presence of temporomandibular disc perforation does not necessarily mean that the patient needs surgical intervention. However, identifying TMJ disc perforation may affect treatment planning, since perforation is seen in the late stage of TMJ arthrosis and the stage of internal derangement affects treatment planning2.
Although MRI is widely considered the gold standard in evaluating the TMJ, clear criteria for diagnosing TMJ disc perforation have not yet been proposed3,5,7. The diagnostic value of MRI for identifying the presence of a perforated disc is reported to be limited, and only a handful of reports have assessed it with ROC curves2,3,4.
Previously, Shen et al. reported an AUC of 0.808 (95% CI 0.77–0.85) by diagnosing TMJ perforation with MRI3. One thing to consider is that this study included 2524 joints, but only 207 joints were perforated. This imbalance between the numbers of cases and controls allows high diagnostic accuracy, exceeding 90%, despite the sensitivity being as low as 0%. In other words, if there are only 10 perforated joints among 100 joints, diagnostic accuracy of 90% is easily achieved by simply diagnosing every joint as non-perforated, while none of the perforated joints are diagnosed correctly. The study still has significance in that it is the first report assessing efficacy of MRI-based diagnosis of TMJ disc perforation based on ROC curve rather than only sensitivity and specificity. The ROC curve is an efficient method for assessing a diagnostic test as it visualizes all possible combinations of true positive rates and false positive rates.
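The arithmetic behind this point can be checked with a few lines of Python. The sketch below uses the hypothetical 100-joint example from the paragraph above (10 perforated, 90 non-perforated) and a trivial rule that labels every joint as non-perforated. The function name is ours and is for illustration only.

```python
def accuracy_and_sensitivity(y_true, y_pred):
    """Return (accuracy, sensitivity) for binary labels, where 1 = perforated."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    positives = sum(y_true)
    return correct / len(y_true), true_positives / positives

# Hypothetical sample from the text: 10 perforated joints among 100.
y_true = [1] * 10 + [0] * 90
# A "diagnosis" that calls every joint non-perforated.
y_pred = [0] * 100

accuracy, sensitivity = accuracy_and_sensitivity(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, sensitivity={sensitivity:.2f}")
# prints: accuracy=0.90, sensitivity=0.00
```

Accuracy alone therefore says little when the classes are imbalanced, which is why the ROC curve and AUC are used for comparison in this study.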
A previous study that constructed a nomogram based on MRI findings reported an AUC of 0.889 (95% CI 0.804–0.973)5. In this study, the best performing deep learning model showed an AUC of 0.940 (95% CI 0.884–0.995). This is the highest result reported, even though the model was validated on a dataset mutually exclusive from the training set.
Deep learning is yielding quantum leaps in a wide range of technologies affecting our lives. Face recognition, voice-to-text, personal assistants, and natural language understanding have become commonplace, and self-driving cars are on the horizon. The field of medicine is no exception: the number of healthcare startups using artificial intelligence is increasing, and markets involving this technology are rapidly emerging. Across the board, medical imagery will likely constitute a primary input for practical applications of AI in healthcare.
Classic data-driven approaches in radiology depend on features seen as important from a human radiologist’s point of view, such as density, heterogeneity of tumors, shape, etc20. Convolutional neural network (CNN), a class of deep learning, automatically discovers the best features for a given task without requiring human-mediated feature selection12,20. CNN reaches or sometimes exceeds human performance in specific tasks.
While this study implemented a deep learning approach to construct a TMJ disc perforation detection model, feature selection and assessment were done by doctors and not by computer. A large amount of data is required for a well-performing CNN, which automates the feature selection process. Machine learning will only get better over time as data sets increase in size and computing power grows21. Given the small amount of available data, deep learning with human-crafted features performs better20, and it does not require extensive computing power. It has also been noted that even minor changes to the input data, often invisible to the human eye, can result in dramatically different classifications20,22. Human verification is thus still required.
The “black box” nature of AI-based diagnosis, meaning the inability to identify the reason for each decision, is another limitation9. Doctors will rarely follow the advice of a machine if they cannot see the reasoning underlying that advice, especially when the responsibility for the patient remains with the clinicians9,23. There are ongoing studies on this, some of them achieving a measure of interpretability. However, though not fully interpretable, the random forest model does yield the importance of the variables in each model, as shown in Fig. 3.
In this study, temporomandibular disc perforation was confirmed by the surgeon during open TMJ surgery. In some cases, however, a perforation may not have been identified because of its size or location, which is a limitation of this study. If disc perforation could be diagnosed more completely from images, the ideal approach would be to have all images interpreted and analyzed by AI. Recently, Chaudhari et al. reported that super-resolution magnetic resonance images created by deep learning can be used for diagnosis24. With the advancement of AI and imaging technology, image interpretation and prediction by AI alone are expected to become possible in the future.
Machine learning approaches are increasingly finding application in the field of medicine and will benefit patients whose doctors have learned to implement them. The history of medicine is replete with cases in which new techniques have been adopted by a few with notable success and then become widespread. To aid readers wishing to try out this new technology, we are sharing the R code used in this study as an attachment (Supplementary Data).
Methods
Patients
A total of 443 patients underwent open TMJ surgery between January 2005 and January 2018 at the Department of Oral and Maxillofacial Surgery, Gangnam Severance Hospital, Yonsei University. Patients meeting any of the following criteria were excluded: a diagnosed tumor such as osteochondroma or synovial chondromatosis; a congenital deformity such as hemifacial microsomia or hemifacial hyperplasia; absent or inadequate-quality TMJ MRI; and prior total TMJ replacement or arthroplasty.
This retrospective study was approved by the Institutional Review Board of Gangnam Severance Hospital (No. #3-2018-0129) and complied with the tenets of the Declaration of Helsinki. Written or verbal informed consent was not obtained from any participants because the IRB of Gangnam Severance Hospital waived the need for individual informed consent, as this study had a non-interventional retrospective design and all data were analyzed anonymously.
Following this selection process, 299 joints from 289 patients remained. These were divided into two groups, perforation and non-perforation, based on the existence of disc perforation identified during surgery. The characteristics of the study subjects are shown in Table 1.
Statistical analyses
Parameters in Table 1 were compared between groups using the chi-square test, Fisher’s exact test, and the Cochran-Armitage trend test for categorical variables, and the Mann–Whitney rank sum test for continuous variables. Statistical analyses were performed using the R programming language (R Core Team, Vienna, Austria, 2018).
Magnetic resonance imaging
MRI was acquired on a 3.0-T Magnetom scanner (Achieva, Philips Medical, Best, The Netherlands) with 3-inch surface coils for the TMJs. For T1-weighted imaging, the following parameters were used: repetition time, 450 ms; echo time, 20 ms; slice thickness, 3 mm; field of view, 120 mm; and acquisition matrix size, 240 × 240. The parameters for T2-weighted imaging were as follows: repetition time, 2,900 ms; and echo time, 90 ms. Sagittal plane MRI was analyzed by two oral and maxillofacial surgeons and an oral and maxillofacial radiologist with reference to previous studies3,5,7,25. All observers were experienced in TMJ MRI interpretation. Images were interpreted at the same time and a final decision was made by consensus.
Disc shapes, bone marrow signal, relationship between the disc and condyle, joint space, and changes of condyle and fossa were investigated. More detailed information and figures of each parameter have been described in a previous report5. The data derived from these features were used to build a prediction model.
Disc shapes
“Biconcave” refers to the normal disc structure and position. “Folded” describes discs with either a cap- or cup-shaped (∩- or ∪-shaped) configuration. A “flattened” disc has lost the voluminous configuration of the anterior band, posterior band, or both. A disc shortened antero-posteriorly resembles a pair of eyeglasses and was classified as “eyeglasses.” A deformed disc without a distinguishing configuration was classified as “amorphous.” A disc falling into more than 2 of the above categories was classified into the more deformed category.
Signal intensity of the bone marrow
Signal intensity of the bone marrow was assessed based on T1-weighted images. When the signal intensity of the condyle was lower than that of the ramus or body of the mandible, it was considered a low bone marrow signal; otherwise it was considered normal.
Fluid collection
Fluid collection was considered present if high signal intensity was observed within the joint spaces on at least two consecutive T2-weighted sagittal MRI slices. The amount of collected fluid was divided into 4 grades, from G0 to G3: G0, no fluid; G1, fluid limited to the verge of the disc; G2, fluid extending over the verge; and G3, capsular expansion.
Joint space
Narrowing of the joint space between the condyle and fossa was divided into 4 categories: normal, narrowing, bone-to-bone contact with the mouth closed, and bone-to-bone contact on mouth opening.
Changes of condyle and fossa
The following 6 features were investigated, 5 of the mandibular condyle and 1 of the articular fossa: osteophyte, erosion, sclerosis, flattening, and a superiorly forming bony projection (spur) of the mandibular condyle, and signal changes of the articular fossa2,5,7. The number of features present was counted.
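One way the readings described above could be turned into model inputs is sketched below in Python. The field names, the ordinal coding, and the assumption that the disc shapes are listed from least to most deformed are ours and are not taken from the source; the R code shared with this study may encode the data differently.

```python
from dataclasses import dataclass

# Assumed ordering, least to most deformed (the order in which the text lists them).
DISC_SHAPES = ["biconcave", "folded", "flattened", "eyeglasses", "amorphous"]
JOINT_SPACE = ["normal", "narrowing", "contact_closed", "contact_open"]
BONE_CHANGES = {"osteophyte", "erosion", "sclerosis", "flattening",
                "spur", "fossa_signal_change"}

@dataclass
class JointReading:
    """Hypothetical record of one joint's consensus MRI reading."""
    age: int
    disc_shapes: list        # every shape category the disc matches
    low_marrow_signal: bool  # condyle signal lower than ramus/body on T1
    joint_space: str
    bone_changes: set        # which of the 6 condyle/fossa features are present

def encode(reading):
    """Convert a reading into a numeric feature vector."""
    # A disc matching several shapes is assigned the most deformed one.
    shape_rank = max(DISC_SHAPES.index(s) for s in reading.disc_shapes)
    unknown = reading.bone_changes - BONE_CHANGES
    if unknown:
        raise ValueError(f"unknown bone change(s): {sorted(unknown)}")
    return [
        reading.age,
        shape_rank,
        int(reading.low_marrow_signal),
        JOINT_SPACE.index(reading.joint_space),
        len(reading.bone_changes),  # the number of features is counted
    ]

example = JointReading(age=52, disc_shapes=["folded", "amorphous"],
                       low_marrow_signal=True, joint_space="narrowing",
                       bone_changes={"erosion", "spur"})
print(encode(example))  # [52, 4, 1, 1, 2]
```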
Machine learning
Prior to formulation of machine learning models, the data set was randomly divided into two mutually exclusive sets, training (80%) and validation (20%)26. The training set was used to construct the prediction model and the validation set was used to validate the performance of each model. Area under the Receiver Operating Characteristic (ROC) curve (AUC) was used to compare the performance of the models with one another, and also with those in previous reports.
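The split and the evaluation metric described above can be sketched as follows. This is a minimal pure-Python illustration, not the R code used in the study: the function names, the random seed, and the toy labels and scores are assumptions. The AUC is computed from its rank interpretation, that is, the probability that a randomly chosen perforated joint receives a higher score than a randomly chosen non-perforated joint.

```python
import random

def split_train_validation(rows, validation_fraction=0.2, seed=0):
    """Shuffle rows and split them into mutually exclusive training and validation sets."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]

def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# 299 joints, identified here only by an index.
joint_ids = list(range(299))
train_ids, validation_ids = split_train_validation(joint_ids)
print(len(train_ids), len(validation_ids))  # 239 60

# Toy example: 8 of the 9 positive/negative pairs are ranked correctly.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auc = auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

Because the validation joints never appear in the training set, the AUC measured on them reflects performance on unseen data.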
A concise description of each machine learning algorithm is provided below. All machine learning models were implemented using the Keras framework27 with the R programming language (R Core Team, Vienna, Austria, 2016)26.
Random forest
Random forest is a tree-based machine learning algorithm which creates subsets of decision trees and combines weak outputs of the trees to yield highly accurate results by calculating the vote of each tree28. Each decision tree predicts the value of a target variable based on several input variables by repeated classification, also known as recursive partitioning28,29. The subsets are created by multiple iterations of random sampling, 500 times in this study. The R package randomForest was employed28.
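The two ideas in this description, random resampling and voting, can be illustrated with a deliberately simplified Python sketch. It is not the randomForest package's algorithm: each "tree" here is a one-split decision stump on a single randomly chosen feature, whereas a real random forest grows full trees by recursive partitioning. The toy data and all function names are hypothetical.

```python
import random
from collections import Counter

def fit_stump(X, y, feature):
    """Find the threshold on one feature that minimizes training errors."""
    best = None
    for threshold in sorted({row[feature] for row in X}):
        for above_label in (0, 1):
            predictions = [above_label if row[feature] >= threshold else 1 - above_label
                           for row in X]
            errors = sum(p != t for p, t in zip(predictions, y))
            if best is None or errors < best[0]:
                best = (errors, threshold, above_label)
    _, threshold, above_label = best
    return feature, threshold, above_label

def predict_stump(stump, row):
    feature, threshold, above_label = stump
    return above_label if row[feature] >= threshold else 1 - above_label

def fit_forest(X, y, n_trees=500, seed=0):
    """Fit n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(len(X)) for _ in X]   # sampling with replacement
        feature = rng.randrange(len(X[0]))            # random feature for this stump
        forest.append(fit_stump([X[i] for i in sample], [y[i] for i in sample], feature))
    return forest

def predict_forest(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: [disc shape rank, number of bone changes]; 1 = perforated.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [3, 3], [4, 2], [4, 4], [3, 5]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(X, y, n_trees=500)
print(predict_forest(forest, [4, 5]), predict_forest(forest, [0, 0]))
```

Individual stumps fitted on unlucky bootstrap samples can be wrong, but the vote over 500 of them is stable, which is the point of the ensemble.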
Multilayer perceptron (MLP)
Multilayer perceptron (MLP) is a class of artificial neural network (ANN) with multiple, or deep, layers of nodes. Each neuronal node connects with others in patterns similar to those in animal neurons and uses a non-linear activation function. This non-linear characteristic makes it possible to distinguish linearly inseparable data. The Keras framework27, a recent deep learning interface, was employed to construct an MLP model in this study.
The MLP model architecture used in this study, illustrated in Fig. 4, is composed of an input layer, three fully connected hidden layers and an output layer. The input layer refers to input data such as features extracted from TMJ MRI. The hidden layers are those where the input features are computed. The node in the output layer represents the computed prediction result27.
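The layer structure described above can be sketched as a plain forward pass. The sketch is in pure Python, not the Keras code used in the study, and several details are assumptions that the text does not specify: the number of nodes per layer, the ReLU activation in the hidden layers, the sigmoid output, the random initial weights, and the scaling of the input features to roughly the 0 to 1 range.

```python
import math
import random

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for every node."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

# Hypothetical sizes: 5 input features, three hidden layers, 1 output node.
LAYER_SIZES = [5, 16, 8, 4, 1]

def build_mlp(layer_sizes, seed=0):
    """Create randomly initialized (weights, biases) pairs, one per layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers

def forward(layers, features):
    """Propagate one feature vector through the hidden layers to the output node."""
    activations = features
    for index, (weights, biases) in enumerate(layers):
        is_output = index == len(layers) - 1
        activations = dense(activations, weights, biases, sigmoid if is_output else relu)
    return activations[0]

mlp = build_mlp(LAYER_SIZES)
scaled_features = [0.52, 1.0, 1.0, 0.33, 0.33]  # hypothetical, already scaled
probability = forward(mlp, scaled_features)
print(f"untrained output: {probability:.3f}")
```

The output of an untrained network is meaningless; it becomes a useful perforation score only after the weights have been fitted to the training set.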
A neural network is trained by adjusting the weights and biases of each node. These parameters are repeatedly adjusted via an optimization algorithm called gradient descent27. Each time predictions are computed from a given data sample (forward propagation), the network performance is assessed through a loss function that measures the error of the prediction. Each network parameter is then adjusted in small increments in the direction that minimizes the loss, a process called back-propagation27,30. This MLP learning process is shown in Fig. 1.
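The training loop described above (forward propagation, loss, back-propagation, small parameter updates) is sketched below in pure Python. To keep it short, the network has a single hidden layer instead of three, uses sigmoid activations and a cross-entropy loss, and is trained on the XOR problem, a standard example of linearly inseparable data. All of these choices, as well as the learning rate and number of epochs, are our assumptions and not the settings of the study.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, n_hidden=4, learning_rate=0.5, epochs=2000, seed=0):
    """Full-batch gradient descent; returns the mean loss recorded at each epoch."""
    rng = random.Random(seed)
    n_in, n = len(X[0]), len(X)
    W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b2 = 0.0
    losses = []
    for _ in range(epochs):
        gW1 = [[0.0] * n_in for _ in range(n_hidden)]
        gb1 = [0.0] * n_hidden
        gW2 = [0.0] * n_hidden
        gb2 = 0.0
        loss = 0.0
        for x, t in zip(X, y):
            # Forward propagation.
            h = [sigmoid(sum(w * xi for w, xi in zip(W1[j], x)) + b1[j])
                 for j in range(n_hidden)]
            p = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2)
            p = min(max(p, 1e-12), 1 - 1e-12)  # guard against log(0)
            loss -= t * math.log(p) + (1 - t) * math.log(1 - p)
            # Back-propagation: gradient of the loss w.r.t. each parameter.
            d_out = p - t  # derivative for sigmoid output with cross-entropy
            gb2 += d_out
            for j in range(n_hidden):
                d_hidden = d_out * W2[j] * h[j] * (1 - h[j])
                gW2[j] += d_out * h[j]
                gb1[j] += d_hidden
                for i in range(n_in):
                    gW1[j][i] += d_hidden * x[i]
        losses.append(loss / n)
        # Gradient descent step: move each parameter against its gradient.
        b2 -= learning_rate * gb2 / n
        for j in range(n_hidden):
            W2[j] -= learning_rate * gW2[j] / n
            b1[j] -= learning_rate * gb1[j] / n
            for i in range(n_in):
                W1[j][i] -= learning_rate * gW1[j][i] / n
    return losses

# XOR: no single straight line separates the two classes.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]
losses = train(X, y)
print(f"loss at first epoch: {losses[0]:.3f}, at last epoch: {losses[-1]:.3f}")
```

Plotting the returned losses against the epoch number gives a training curve of the same kind as the one referred to in Fig. 1.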
Conclusion
Implementing deep learning showed superior performance in predicting disc perforation in TMJ compared to conventional methods and previous reports.
References
Dimitroulis, G. The prevalence of osteoarthrosis in cases of advanced internal derangement of the temporomandibular joint: A clinical, surgical and histological study. Int. J. Oral Maxillofac. Surg. 34, 345–349 (2005).
Kuribayashi, A., Okochi, K., Kobayashi, K. & Kurabayashi, T. MRI findings of temporomandibular joints with disk perforation. Oral Surg. Oral Med. Oral Pathol. Oral Radiol. Endod. 106, 419–425 (2008).
Shen, P. et al. Magnetic resonance imaging applied to the diagnosis of perforation of the temporomandibular joint. J. Craniomaxillofac. Surg. 42, 874–878 (2014).
Limchaichana, N., Petersson, A. & Rohlin, M. The efficacy of magnetic resonance imaging in the diagnosis of degenerative and inflammatory temporomandibular joint disorders: A systematic literature review. Oral Surg. Oral Med. Oral Pathol. Oral Radiol. Endod. 102, 521–536 (2006).
Kim, J. Y., Jeon, K. J., Kim, M. G., Park, K. H. & Huh, J. K. A nomogram for classification of temporomandibular joint disk perforation based on magnetic resonance imaging. Oral Surg. Oral Med. Oral Pathol. Oral Radiol. 125, 682–692 (2018).
Rao, V. M. & Bacelar, M. T. MR imaging of the temporomandibular joint. Neuroimaging Clin. N. Am. 14, 761–775 (2004).
Yura, S., Nobata, K. & Shima, T. Diagnostic accuracy of fat-saturated T2-weighted magnetic resonance imaging in the diagnosis of perforation of the articular disc of the temporomandibular joint. Br. J. Oral Maxillofac. Surg. 50, 365–368 (2012).
Burt, J. R. et al. Deep learning beyond cats and dogs: Recent advances in diagnosing breast cancer with deep neural networks. Br. J. Radiol. 91, 20170545 (2018).
Fazal, M. I., Patel, M. E., Tye, J. & Gupta, Y. The past, present and future role of artificial intelligence in imaging. Eur. J. Radiol. 105, 246–250 (2018).
Zhang, W., Li, J., Li, Z. B. & Li, Z. Predicting postoperative facial swelling following impacted mandibular third molars extraction by using artificial neural networks evaluation. Sci. Rep. 8, 12281 (2018).
Kim, D. W., Kim, H., Nam, W., Kim, H. J. & Cha, I. H. Machine learning to predict the occurrence of bisphosphonate-related osteonecrosis of the jaw associated with dental extraction: A preliminary report. Bone 116, 207–214 (2018).
Chartrand, G. et al. Deep learning: A primer for radiologists. Radiographics 37, 2113–2131 (2017).
Yang, H. et al. Deep learning for automated detection of cyst and tumors of the jaw in panoramic radiographs. J. Clin. Med. 9, 1839 (2020).
Lee, K.-S., Ryu, J.-J., Jang, H. S., Lee, D.-Y. & Jung, S.-K. Deep convolutional neural networks based analysis of cephalometric radiographs for differential diagnosis of orthognathic surgery indications. Appl. Sci. 10, 2124 (2020).
Bianchi, J. et al. Osteoarthritis of the Temporomandibular Joint can be diagnosed earlier using biomarkers and machine learning. Sci. Rep. 10, 8012 (2020).
Lee, K. S. et al. Automated detection of TMJ osteoarthritis based on artificial intelligence. J. Dent. Res. 99, 1363–1367 (2020).
Akkus, Z., Galimzianova, A., Hoogi, A., Rubin, D. L. & Erickson, B. J. Deep learning for brain MRI segmentation: State of the art and future directions. J. Digit. Imaging 30, 449–459 (2017).
Işin, A., Direkoǧlu, C. & Şah, M. Review of MRI-based brain tumor image segmentation using deep learning methods. Proc. Comput. Sci. 102, 317–324 (2016).
Plis, S. M. et al. Deep learning for neuroimaging: A validation study. Front. Neurosci. 8, 229 (2014).
Savadjiev, P. et al. Demystification of AI-driven medical image interpretation: Past, present and future. Eur. Radiol. 29, 1616–1624 (2018).
Chockley, K. & Emanuel, E. The end of radiology? Three threats to the future practice of radiology. J. Am. Coll. Radiol. 13, 1415–1420 (2016).
Szegedy, C. et al. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199 (2013).
Teach, R. L. & Shortliffe, E. H. An analysis of physician attitudes regarding computer-based clinical consultation systems. Comput. Biomed. Res. 14, 542–558 (1981).
Chaudhari, A. S. et al. Utility of deep learning super-resolution in the context of osteoarthritis MRI biomarkers. J. Magn. Reson. Imaging 51, 768–779 (2020).
Huh, J. K., Kim, H. G. & Ko, J. Y. Magnetic resonance imaging of temporomandibular joint synovial fluid collection and disk morphology. Oral Surg. Oral Med. Oral Pathol. Oral Radiol. Endod 95, 665–671 (2003).
Kim, W. et al. Development of novel breast cancer recurrence prediction model using support vector machine. J. Breast Cancer 15, 230–238 (2012).
Chollet, F., Allaire, J. J. & others R Interface to Keras (GitHub, 2017).
Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
Breiman, L., Friedman, J., Stone, C. J. & Olshen, R. A. Classification and Regression Trees (Chapman & Hall, 1984).
Rumelhart, D. E., Hinton, G. E. & Williams, R. J. Learning Internal Representations by Error Propagation (California Univ San Diego La Jolla Inst for Cognitive Science, 1985).
Author information
Contributions
J.Y.K. and D.W.K. investigated data and wrote the manuscript; J.Y.K. and K.J.J. read the magnetic resonance images; H.K. revised the code for artificial intelligence; J.K.H. revised and edited the final manuscript. All authors reviewed the manuscript and have read and agreed to the published version.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Kim, JY., Kim, D., Jeon, K.J. et al. Using deep learning to predict temporomandibular joint disc perforation based on magnetic resonance imaging. Sci Rep 11, 6680 (2021). https://doi.org/10.1038/s41598-021-86115-3
DOI: https://doi.org/10.1038/s41598-021-86115-3