Improving performance of deep learning models using 3.5D U-Net via majority voting for tooth segmentation on cone beam computed tomography

Hsu, Kang; Yuh, Da-Yo; Lin, Shao-Chieh; Lyu, Pin-Sian; Pan, Guan-Xin; Zhuang, Yi-Chun; Chang, Chia-Ching; Peng, Hsu-Hsia; Lee, Tung-Yang; Juan, Cheng-Hsuan; Juan, Cheng-En; Liu, Yi-Jui; Juan, Chun-Jung

doi:10.1038/s41598-022-23901-7

Published: 17 November 2022

Scientific Reports volume 12, Article number: 19809 (2022)

Abstract

Deep learning allows automatic segmentation of teeth on cone beam computed tomography (CBCT). However, the segmentation performance of deep learning varies among different training strategies. Our aim was to propose a 3.5D U-Net to improve the performance of the U-Net in segmenting teeth on CBCT. This study retrospectively enrolled 24 patients who received CBCT. Five U-Nets (2Da U-Net, 2Dc U-Net, 2Ds U-Net, 2.5Da U-Net, and 3D U-Net) were trained to segment the teeth. Four additional U-Nets (2.5Dv U-Net, 3.5Dv5 U-Net, 3.5Dv4 U-Net, and 3.5Dv3 U-Net) were obtained using majority voting. Mathematical morphology operations, namely erosion and dilation (E&D), were applied to remove diminutive noise speckles. Segmentation performance was evaluated by fourfold cross validation using the Dice similarity coefficient (DSC), accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). The Kruskal–Wallis test with post hoc analysis using Bonferroni correction was used for group comparison. P < 0.05 was considered statistically significant. The performance of U-Nets varied significantly among different training strategies for teeth segmentation on CBCT (P < 0.05). The 3.5Dv5 U-Net and 2.5Dv U-Net showed DSC and PPV significantly higher than any of the five originally trained U-Nets (all P < 0.05). E&D significantly improved the DSC, accuracy, specificity, and PPV (all P < 0.005). The 3.5Dv5 U-Net achieved the highest DSC and accuracy among all U-Nets. The segmentation performance of the U-Net can be improved by majority voting and E&D. Overall, the 3.5Dv5 U-Net achieved the best segmentation performance among all U-Nets.

Introduction

Cone beam computed tomography (CBCT) has been widely applied to orthodontics, periodontics, endodontics, stomatology, dental implant surgery, maxillofacial surgery, and forensic odontology^1,2. It is superior to panoramic radiography and periapical radiography by providing 3D information rather than 2D information and has advantages over conventional CT including, but not limited to, lower radiation doses and lower costs.

Rapid, accurate, and robust segmentation of human teeth on CBCT is an important foundation of clinical practice in dentistry. It allows clear visualization of teeth and supports qualitative evaluation and quantitative analysis of dental diseases such as caries^3,4, impacted tooth⁵, acute pulpitis⁶, apical periodontitis⁷, root fracture, and periodontal lesion⁴. Manual segmentation by experts is usually considered the gold standard. However, it is laborious and time-consuming, and its performance varies among experts⁸. Semiautomatic segmentation is less laborious and less time-consuming while achieving performance comparable to manual segmentation^9,10. Automatic segmentation is the fastest and most efficient of the three approaches¹¹. However, automatic segmentation has been shown to be inferior to manual and semiautomatic segmentation in calculating tooth volume when the water displacement method is used as the gold standard⁹. In addition, automatic segmentation of teeth on CBCT remains challenging because of more severe artifacts, such as beam hardening artifacts^12,13, unsharpness^12,13,14, ring-like artifacts^13,14, partial volume averaging¹³, undersampling¹³, cone-beam effect^13,14, noise¹⁵, and aliasing artifacts, and poorer soft-tissue contrast as compared to conventional CT¹⁶.

Deep learning is a subset of machine learning. Inspired by human neural structures, deep learning implements multi-layer artificial neural networks that learn in a manner loosely resembling the human brain. Supervised learning is the most common form of deep learning, although the learning can also be semi-supervised or unsupervised. Fed with labeled data, including but not limited to images, these complex, non-linear neural networks produce results that enable us to detect, classify, and segment objects of interest¹⁷. Deep learning has recently attracted much attention because it can perform as well as humans, or even better, in specific tasks.

First proposed in 2015 by Ronneberger et al.¹⁸, U-Net has been widely applied for medical image segmentation because it provides context information while requiring less time and less data to train¹⁹. The U-Net contains a contraction path and an expansion path, which encode the data using convolution and decode the data using up-convolution, respectively. It also links the encoder and decoder layer by layer: feature maps from the encoder are copied and cropped to match the size of the decoder feature maps and then concatenated with them, so that the network can not only classify but also localize the object for segmentation.

Several U-Nets, including 2D U-Net^20,21, 2.5D U-Net²², and 3D U-Net²³, have been proposed for CBCT segmentation. A variant of the 2.5D U-Net that uses majority voting of 2D U-Nets trained on 3 orthogonal imaging planes has been shown to outperform any single U-Net for maxillary and mandibular bony structure segmentation on CBCT²⁴. To the best of our knowledge, CBCT segmentation using a 3.5D U-Net integrating 2D U-Nets, 2.5D U-Net, and 3D U-Net has not yet been documented.

We hypothesized that the segmentation performance of a 3.5D U-Net might be improved using majority voting by reducing the false positive results occurring in 2D U-Net, 2.5D U-Net and 3D U-Net. In this study, we intentionally applied 6 previously introduced U-Nets including three orthogonal 2D U-Nets, two 2.5D U-Nets, plus a 3D U-Net and added three newly proposed 3.5D U-Nets by integrating 2D U-Nets, 2.5D U-Nets and 3D U-Net using the majority voting method for segmentation of teeth on CBCT. The proposed 3.5D U-Nets were compared to the previous U-Nets using slice-by-slice calculation of Dice similarity coefficient (DSC) and other diagnostic metrics including accuracy (Ac), sensitivity (Sn), specificity (Sp), positive predictive value (PPV), and negative predictive value (NPV) to verify our hypothesis.

Materials and methods

This study was approved by the Institutional Review Board of China Medical University with written informed consent waived for this retrospective study. All methods were performed in accordance with the relevant guidelines and regulations.

Patient cohort and CBCT parameters

Figure 1 shows the workflow of our study, including noise removal, patient selection, GT labeling, data augmentation, and patient grouping. A total of 194 patients who underwent CBCT from January to June 2020 were initially collected. All patients were scanned using an Auge Solio CBCT scanner (Asahi Roentgen Ind., Kyoto, Japan) that is widely used in dentistry and maxillofacial surgery. All scans were performed using a tube voltage of 85 kVp, a tube current of 6 mA, and an isotropic voxel size of 0.19 mm. The imaging protocol covered from the inferior orbital rim to the inferior end of the mandible.

To minimize the potential influence of metal-related artifacts on the segmentation task, patients with heavy metallic dental burden (MDB), including metallic dental implants, braces, and crowns, were excluded. CBCTs with heavy MDB were automatically identified and excluded according to the following steps. First, two thresholds were empirically set, with the first threshold (TH1) of 3070 HU and the second threshold (TH2) of 2500 HU representing the density of metallic materials and the density of enamel, respectively. Second, the MDB ratio (MDBR) was defined via dividing TH1 by TH2. Third, a third threshold (TH3) was set at MDBR = 0.4. Fourth, heavy MDB was defined by MDBR > TH3. Fifth, patients with heavy MDB were excluded. A total of 24 patients were randomly selected from the remaining patients for segmentation of teeth in this study, to limit the manpower required to define the ground truth (GT). Patients were classified into 4 subsets, each containing the same number of patients (N = 6), with the GT defined by different observers.
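The exclusion rule can be sketched in a few lines of Python. The text defines MDBR only as "TH1 divided by TH2"; read literally, that would give the same constant for every scan, so this sketch assumes one plausible reading in which the ratio is taken between voxel counts: voxels at or above TH1 divided by voxels at or above TH2. That reading, the function names, and the handling of scans without dense voxels are all assumptions and are not confirmed by the source.

```python
TH1_HU = 3070    # metal-density threshold stated in the text
TH2_HU = 2500    # enamel-density threshold stated in the text
TH3_RATIO = 0.4  # MDBR cut-off stated in the text


def metallic_dental_burden_ratio(hu_values):
    """Assumed reading: count of voxels >= TH1 divided by count of voxels >= TH2."""
    metal_like = sum(1 for hu in hu_values if hu >= TH1_HU)
    enamel_or_denser = sum(1 for hu in hu_values if hu >= TH2_HU)
    if enamel_or_denser == 0:
        return 0.0  # assumption: no dense voxels means no metallic burden
    return metal_like / enamel_or_denser


def has_heavy_mdb(hu_values):
    """Heavy MDB is MDBR > TH3, as stated in the text."""
    return metallic_dental_burden_ratio(hu_values) > TH3_RATIO
```

Here `hu_values` is a flat sequence of voxel intensities in HU; a scan for which `has_heavy_mdb` returns `True` would be excluded.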

Imaging preprocessing

To remove high-frequency noise in CBCT, a 3D Gaussian filter with a standard deviation of 1 was applied first. All teeth were contoured semiautomatically, slice by slice, on CBCT by four observers: one dentist (K.H., with 6 years of experience in medical imaging research) and three researchers majoring in medical imaging analysis (P.S.L., G.X.P., and Y.C.Z., with one more year of experience in medical imaging analysis). The semiautomatic method was modified from the thresholding method used in our previous study²⁵. First, the CBCT images were loaded and displayed. Second, a polygonal region-of-interest (ROI) encompassing the teeth was drawn. Third, a threshold was initially applied and then adjusted to fit the contour of the teeth. Fourth, holes within the contour of the teeth were filled. Finally, all images with teeth successfully contoured were saved as GT. All GTs were verified by a neuroradiologist (C.J.J., with more than 20 years of experience in medical imaging analysis).

Data augmentation with an augmentation factor of 2 was achieved by flipping all images along the horizontal direction. For fair comparison among the original U-Nets, no additional data augmentation was performed for either 2.5D U-Net or 3D U-Net.
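A minimal sketch of this augmentation, assuming that "horizontal direction" means left-to-right mirroring of each 2D slice and that the GT mask is mirrored together with its image (the source does not spell out either point). Slices are represented as plain lists of rows, and the function names are illustrative.

```python
def flip_horizontal(image_2d):
    """Mirror a 2D slice (a list of rows) left to right."""
    return [list(reversed(row)) for row in image_2d]


def augment_with_flip(images, masks):
    """Return the originals plus mirrored copies, doubling the data set."""
    aug_images = list(images) + [flip_horizontal(img) for img in images]
    aug_masks = list(masks) + [flip_horizontal(m) for m in masks]
    return aug_images, aug_masks
```

Appending one mirrored copy per slice yields the augmentation factor of 2 reported above.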

Deep learning models (DLMs)

U-Net was employed for semantic segmentation of teeth in this study¹⁸. The U-Net architecture consists of an encoding path and a decoding path arranged symmetrically. In the encoding path, each layer contains two convolution blocks, each followed by a rectified linear unit (ReLU), to obtain a lower-dimensional representation that is then down-sampled by a max pooling operation. In the decoding path, the representation is concatenated with the corresponding feature maps obtained in the encoding path, followed by two convolution blocks, and then up-sampled by nearest convolution operation. The final output layer of the U-Net was connected to a dual-class softmax classifier, i.e., teeth and non-teeth.

In our previous studies, we found that the segmentation performance of the 2D U-Net varied between lesions, with the DSC ranging from as low as 0.48 in salivary gland tumors²⁶ to as high as 0.97 in acute ischemic stroke lesions²⁵ on magnetic resonance imaging. In this study, we intentionally employed a total of nine different DLMs to perform automatic segmentation of the teeth. First, three sets of orthogonal images were used to train axial, coronal, and sagittal 2D U-Nets (named 2Da U-Net, 2Dc U-Net, and 2Ds U-Net, respectively). Second, a 2.5D U-Net was constructed using three continuous axial slices placed in three channels to form an ensemble input image for training the DLM (named 2.5Da U-Net). Third, a 3D U-Net was constructed using a cuboid (64 × 64 × 128) as an input image. Architectures and hyperparameters of these U-Nets are shown in Table 1. Finally, we applied majority voting to create 4 additional U-Nets. A 2.5Dv U-Net was generated by combining, via majority voting, the predictions of the 2D U-Nets trained on each of the three orthogonal slice orientations²⁴. Three additional 3.5D U-Nets (i.e., 3.5Dv3 U-Net, 3.5Dv4 U-Net, and 3.5Dv5 U-Net) were generated via majority voting of the predictions of the 2D U-Nets, 2.5D U-Net, and 3D U-Net in different combinations, as illustrated in Fig. 2.
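The voting step can be sketched as a voxel-wise vote over binary predictions. The model combinations follow the definitions in the Abbreviations section. The strict-majority rule (more than half of the voters), and in particular the handling of a 2-to-2 tie among the four voters of the 3.5Dv4 U-Net, is an assumption because Fig. 2 is not reproduced here. Masks are flat lists of 0/1 values of equal length, and the function names are illustrative.

```python
def majority_vote(*masks):
    """Voxel-wise strict majority over equally sized binary masks (flat 0/1 lists)."""
    n_voters = len(masks)
    return [1 if 2 * sum(votes) > n_voters else 0 for votes in zip(*masks)]


def build_voting_models(pred):
    """pred maps '2Da', '2Dc', '2Ds', '2.5Da', and '3D' to binary masks."""
    v25 = majority_vote(pred["2Da"], pred["2Dc"], pred["2Ds"])
    return {
        "2.5Dv": v25,
        "3.5Dv3": majority_vote(v25, pred["2.5Da"], pred["3D"]),
        "3.5Dv4": majority_vote(pred["2Da"], pred["2Dc"], pred["2Ds"], pred["3D"]),
        "3.5Dv5": majority_vote(pred["2Da"], pred["2Dc"], pred["2Ds"],
                                pred["2.5Da"], pred["3D"]),
    }
```

Under this rule a voxel is labeled as tooth only when more than half of the contributing models agree, which is how isolated false positives produced by a single model are voted out.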

Table 1 Architectures and hyperparameters of 2D U-Net, 2.5Da U-Net, and 3D U-Net structures.

The prediction of each of the nine aforementioned U-Nets was post-processed using basic operations of mathematical morphology, i.e., erosion and dilation. The binary erosion of I by B, denoted by $I\ominus B$, is defined as Eq. (1):

$$I\ominus B=\left\{z\in E|{B}_{z}\subseteq I\right\},$$

(1)

where E denotes a Euclidean space, I denotes a binary image in E, B denotes a spherical structuring element with a radius of 2 pixels, and B_z denotes the translation of B by the vector z. The binary dilation of I by B, denoted by $I\oplus B$, is defined as Eq. (2):

$$I\oplus B=\left\{z\in E|{({B}^{s})}_{z}\cap I\ne \phi \right\},$$

(2)

where B denotes a spherical structuring element with a radius of 2 pixels and B^s denotes the symmetric (reflected) set of B, as defined by Eq. (3):

$${B}^{s}=\left\{x\in E|-x\in B\right\}$$

(3)
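Equations (1) to (3) can be illustrated with a small set-based sketch in which a binary volume is represented as a Python set of integer (x, y, z) coordinates of foreground voxels. Applying erosion first and dilation second with the same spherical element is taken from the text; the set representation, the function names, and the unbounded grid are simplifying assumptions, and a practical implementation would normally rely on an image-processing library.

```python
from itertools import product


def ball_offsets(radius=2):
    """Integer offsets inside a sphere of the given radius (structuring element B)."""
    r = range(-radius, radius + 1)
    return [(dx, dy, dz) for dx, dy, dz in product(r, r, r)
            if dx * dx + dy * dy + dz * dz <= radius * radius]


def erode(voxels, offsets):
    """Eq. (1): keep z only if every translated point z + b lies in the mask.

    Candidates can be restricted to the mask because B contains the origin.
    """
    return {z for z in voxels
            if all((z[0] + dx, z[1] + dy, z[2] + dz) in voxels
                   for dx, dy, dz in offsets)}


def dilate(voxels, offsets):
    """Eq. (2): the reflected element at z meets the mask exactly when z = i + b."""
    return {(x + dx, y + dy, z + dz)
            for x, y, z in voxels for dx, dy, dz in offsets}


def erode_then_dilate(voxels, radius=2):
    """E&D: erosion followed by dilation with the same spherical element."""
    offsets = ball_offsets(radius)
    return dilate(erode(voxels, offsets), offsets)
```

In this sketch an isolated speck smaller than the structuring element disappears during erosion and is not restored by dilation, while the bulk of a larger object is recovered with slightly rounded corners.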

Cross validation and model performance evaluation

The flowchart of the U-Nets in automatic segmentation of teeth using fourfold cross validation is shown in Fig. 3²⁷. Slice-based evaluation of the performance of a DLM was conducted using fourfold cross validation to reflect the performance of a DLM in every slice²⁸. The overall segmentation performance was calculated by averaging the performance of every slice²⁸. Each voxel of the CBCT image was classified as true positive (TP), true negative (TN), false positive (FP), or false negative (FN) by comparing the prediction to the GT. Segmentation performance of DLMs was evaluated using Ac, DSC, Sn, Sp, PPV, and NPV, defined by Eqs. (4) to (9), respectively.

$$Ac=\frac{TP+TN}{FP+TP+FN+TN}$$

(4)

$$DSC=\frac{2TP}{FP+2TP+FN}$$

(5)

$$Sn=\frac{TP}{TP+FN}$$

(6)

$$Sp=\frac{TN}{TN+FP}$$

(7)

$$PPV=\frac{TP}{TP+FP}$$

(8)

$$NPV=\frac{TN}{TN+FN}$$

(9)
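Equations (4) to (9) translate directly into code. The sketch below counts TP, TN, FP, and FN for one slice given as flat 0/1 lists and returns the six metrics. Returning NaN when a denominator is zero is an assumption; the text does not say how such cases were handled.

```python
def confusion_counts(pred, gt):
    """Count TP, TN, FP, FN by comparing prediction and ground truth per voxel."""
    tp = sum(1 for p, g in zip(pred, gt) if p == 1 and g == 1)
    tn = sum(1 for p, g in zip(pred, gt) if p == 0 and g == 0)
    fp = sum(1 for p, g in zip(pred, gt) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(pred, gt) if p == 0 and g == 1)
    return tp, tn, fp, fn


def safe_div(num, den):
    """Assumption: an undefined ratio (zero denominator) is reported as NaN."""
    return num / den if den else float("nan")


def segmentation_metrics(pred, gt):
    """Ac, DSC, Sn, Sp, PPV, and NPV as defined in Eqs. (4) to (9)."""
    tp, tn, fp, fn = confusion_counts(pred, gt)
    return {
        "Ac": safe_div(tp + tn, fp + tp + fn + tn),
        "DSC": safe_div(2 * tp, fp + 2 * tp + fn),
        "Sn": safe_div(tp, tp + fn),
        "Sp": safe_div(tn, tn + fp),
        "PPV": safe_div(tp, tp + fp),
        "NPV": safe_div(tn, tn + fn),
    }
```

For example, a slice with TP = 2, FP = 1, FN = 1, and TN = 4 gives Ac = 6/8 = 0.75 and DSC = 4/6, or about 0.667.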

Statistical analysis

In the statistical analyses, the normality of data was first assessed using the Kolmogorov–Smirnov test. The paired Wilcoxon signed-rank test was used to compare continuous data before and after E&D. A nonparametric Kruskal–Wallis test with post hoc analysis using Bonferroni correction was applied for group comparison among the 9 U-Nets. A P value less than 0.05 was considered statistically significant.
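As a small illustration of the Bonferroni step only, the sketch below multiplies each raw P value by the number of comparisons and caps the result at 1. It assumes that the post hoc analysis compared all pairs of the nine U-Nets, which would give 36 comparisons; the text does not state the number of comparisons, and the P values in the usage note are made up for illustration.

```python
from itertools import combinations

MODELS = ["2Da", "2Dc", "2Ds", "2.5Da", "3D", "2.5Dv", "3.5Dv3", "3.5Dv4", "3.5Dv5"]


def pairwise_comparisons(models):
    """All unordered pairs of models (assumed set of post hoc comparisons)."""
    return list(combinations(models, 2))


def bonferroni_adjust(p_values):
    """Multiply each raw P value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

With three illustrative raw values, `bonferroni_adjust([0.001, 0.02, 0.5])` gives approximately 0.003, 0.06, and 1.0.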

Results

A total of 24 patients were finally recruited, including 15 men and 9 women, with an age of 29.1 ± 14.7 years (mean ± standard deviation). Demographic characteristics of the different subsets and groups of patients are summarized in Table 2. There was no difference in age among the subsets of patients (P = 0.5658). Impacted teeth were the most common clinical diagnosis, accounting for 75% (18 of 24) of the patients who received CBCT examination.

Table 2 Demographics of patients in different subsets.

Comparisons of DSC among U-Nets

Comparisons of DSC among the nine different U-Nets before and after E&D are shown in Fig. 4 and Table S1. The DSC after E&D was significantly different from that before E&D in all U-Nets (all P < 0.01). While the DSC after E&D was significantly higher than that before E&D in the 5 originally trained U-Nets (all P < 0.005), it was significantly lower than that before E&D in the 4 U-Nets generated by majority voting (all P < 0.01). Before E&D, the 3.5Dv5 U-Net achieved the highest DSC, which was significantly higher than that of any of the five originally trained U-Nets (all P < 0.005), while the 2Da U-Net and 2.5Da U-Net performed poorest, with DSC significantly lower than that of the other U-Nets (P < 0.005) except the 3D U-Net (P = 0.174 to 0.222). After E&D, the 3.5Dv5 U-Net achieved the highest DSC, which was significantly higher than that of most U-Nets (P < 0.01) except the 2.5Dv U-Net (P = 0.551) and 2.5Da U-Net (P = 0.07).

Comparisons of accuracy among U-Nets

Comparisons of accuracy among the 9 different U-Nets before and after E&D are shown in Fig. 5 and Table S2. The accuracy after E&D was significantly different from that before E&D in all U-Nets (all P < 0.01), with the median accuracy higher than 0.997 in all U-Nets both before and after E&D. While the accuracy after E&D was significantly higher than that before E&D in the 5 originally trained U-Nets (all P < 0.01), it was significantly lower than that before E&D in the 4 U-Nets generated by majority voting (all P < 0.005). Before E&D, the 3.5Dv5 U-Net achieved the highest accuracy, which was significantly higher than that of the 2.5Da U-Net, 3D U-Net, 3.5Dv3 U-Net, and 3.5Dv4 U-Net (P < 0.01). After E&D, the 3.5Dv5 U-Net still achieved the highest accuracy, which was significantly higher than that of the 2.5Da U-Net, 3D U-Net, 3.5Dv3 U-Net, and 3.5Dv4 U-Net (P < 0.05).

Comparisons of sensitivity among U-Nets

Comparisons of sensitivity among the nine different U-Nets before and after E&D are shown in Fig. 6 and Table S3. Before E&D, the 2Dc U-Net achieved the highest sensitivity, followed by the 2Ds U-Net, 2Da U-Net, 2.5Da U-Net, and 3.5Dv5 U-Net (P = 0.243 to 1), which was significantly higher than that of the 3D U-Net (P < 0.05) and the other U-Nets with majority voting (P < 0.005). E&D significantly reduced the sensitivity in all U-Nets (all P < 0.005). After E&D, the 2Da U-Net achieved the highest sensitivity, followed by the 2Dc U-Net, 2Ds U-Net, 2.5Da U-Net, and 3.5Dv5 U-Net (P = 0.141 to 1), which was significantly higher than that of the 3D U-Net (P < 0.05) and the other U-Nets with majority voting (P < 0.005).

Comparisons of specificity among U-Nets

Comparisons of specificity among the nine different U-Nets before and after E&D are shown in Fig. 7 and Table S4. The specificity after E&D was significantly higher than that before E&D in all U-Nets (all P < 0.005), with the median specificity higher than 0.998 in all U-Nets both before and after E&D. The 3.5Dv3 U-Net and 2.5Dv U-Net achieved a median specificity of 1, significantly higher than that of the 3.5Dv5 U-Net (P < 0.05) and all 5 originally trained U-Nets both before and after E&D (all P < 0.005).

Comparisons of PPV among U-Nets

Comparisons of PPV among the nine different U-Nets before and after E&D are shown in Fig. 8 and Table S5. The PPV was improved after E&D in all U-Nets (all P < 0.005). Before E&D, the 2Da U-Net and 2.5Da U-Net performed poorest, with the PPV significantly lower than that of the other U-Nets (P < 0.05) except the 3D U-Net (P = 0.197). The 3.5Dv3 U-Net achieved the highest PPV, which was similar to that of the 3.5Dv4 U-Net, 3.5Dv5 U-Net, and 2.5Dv U-Net (P = 0.405 to 0.922) but significantly higher than that of all 5 originally trained U-Nets (all P < 0.005). After E&D, the 2Da U-Net and 2.5Da U-Net performed similarly to the other originally trained U-Nets (P = 0.849 to 1). The 3.5Dv3 U-Net still achieved the highest PPV, which was similar to that of the 3.5Dv4 U-Net, 3.5Dv5 U-Net, and 2.5Dv U-Net (P = 0.184 to 0.995) but significantly higher than that of all 5 originally trained U-Nets (all P < 0.005).

Comparisons of NPV among U-Nets

Comparisons of NPV among the nine different U-Nets before and after E&D are shown in Fig. 9 and Table S6. E&D significantly reduced the NPV in all U-Nets (all P < 0.005), with the median NPV higher than 0.997 in all U-Nets both before and after E&D. The 2Dc U-Net achieved the highest NPV, followed by the 2Da U-Net, 2.5Da U-Net, 2Ds U-Net, and 3.5Dv5 U-Net (P = 0.278 to 1), and significantly higher than that of the 3D U-Net (P < 0.01) and the other U-Nets with majority voting (P < 0.005) both before and after E&D.

Case demonstration

Figures 10 and 11 demonstrate the 3D illustration of predictions and error maps of 4 different U-Nets before and after E&D in two patients.

Discussion

Accurate segmentation of bony structures and teeth on CBCT is an important foundation of stomatology. Training strategy has been shown to be a factor influencing the segmentation performance of convolutional neural networks (CNNs) for bony structures on CBCT²⁴. In our study, we intentionally applied nine different training strategies based on the U-Net architecture and compared their performance in teeth segmentation on CBCT. Our study demonstrated that the segmentation performance of the U-Net varied among different training strategies. The 2Da U-Net and the 2.5Da U-Net had poor segmentation performance, with a median DSC of 0.464 and 0.469, respectively. The segmentation performance of the 2Da U-Net was improved via 3 strategies. First, by changing the input imaging data, the median DSC was significantly improved to 0.752 and 0.766 in the 2Dc U-Net and the 2Ds U-Net, respectively (via changing slice orientation), and slightly improved to 0.653 in the 3D U-Net (via supplying additional z-axis information). Second, by using majority voting, the median DSC was significantly improved to 0.922 (3.5Dv5 U-Net). Third, by employing mathematical morphology using E&D, the median DSC was significantly improved to 0.836 and 0.865 in the 2Da U-Net and the 2.5Da U-Net, respectively. Table 3 compares the segmentation performance of our proposed methods to those proposed by other researchers. The DSC in our study is lower than that in some previous studies^{20,21,27,29,30,31,32}, in which the DSC ranges from 0.934³¹ to 0.97³⁰. In our study, we calculated the DSC slice by slice and then averaged the DSC of all slices, rather than calculating the DSC for the whole CBCT volume as in other studies^{20,21,23,27,29,30,31,33,34,35,36,37}. Nevertheless, the highest DSC achieved by our 3.5Dv5 U-Net is consistent with other previous studies^23,33,34,35, in which the DSC ranges from 0.9²³ to 0.921³³.
Our study achieved an accuracy ranging from 0.997 to 0.999, which is higher than that reported in previous studies^30,36,37. Our 2D U-Nets achieved a sensitivity ranging from 0.934 to 0.943, which is similar to that of Fontenele’s study³⁰ (0.91 to 0.94) and Lee’s study³⁴ (0.932), and higher than that of Shaheen’s study²³ (0.83). In addition, our U-Nets with majority voting achieved a PPV ranging from 0.978 to 0.996, which is similar to that of Shaheen’s study²³ (0.98) and higher than that of Lee’s study³⁴ (0.904).

Table 3 Comparison of segmentation of human teeth on CBCT using CNN.

Segmentation of teeth on the whole volume of CBCT remains challenging for the 2D U-Net because of the similar Hounsfield units of teeth and bony structures and the insufficient spatial information along the direction perpendicular to the input images, i.e., lack of z-axis information in axial slices, y-axis information in coronal slices, and x-axis information in sagittal slices. Using solely axial images as input data, the 2Da U-Net tends to falsely predict clusters of bony structures that mimic tooth roots on the axial plane as teeth. According to Eq. (5), the DSC is zero for any slice that contains no teeth in the GT but in which at least one pixel is predicted as tooth. Accordingly, the overall DSC dropped because of false positive predictions on slices that do not contain any tooth pixel in the GT. These false positive results of the 2Da U-Net have two characteristic features: (1) no specific spatial connection between two clusters along the z-axis and (2) specific tooth root-mimicking geometric shapes, i.e., round or ovoid shapes. Such false positive results could be eliminated or reduced by changing the orientation of the input slices from axial to coronal or sagittal. With coronal or sagittal slices as input, the 2Dc U-Net and 2Ds U-Net received abundant z-axis information, allowing the model to recognize the connection between the tooth roots and the whole tooth and thereby helping to eliminate part of the false positive results around the tooth roots. Although the small round or ovoid false positive results of the 2Da U-Net were reduced, the 2Dc U-Net and 2Ds U-Net had the drawback of falsely identifying sheet-like bony structures as teeth. The false positive results of the 2Da U-Net could also be remedied by providing additional z-axis information in a 3D patch as input data. However, the 3D U-Net produced some different false positive results while reducing those of the 2Da U-Net.
These false positive results might be attributed to the insufficient and discontinuous information at the edge of each 3D patch.
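The effect of slice-based averaging on the DSC can be made concrete with a toy example of two slices: one in which the teeth are predicted perfectly and one that contains no teeth in the GT but has a single false positive pixel. The sketch compares the slice-averaged DSC with the DSC computed over the pooled volume. Skipping slices in which both the prediction and the GT are empty (where Eq. (5) gives 0/0) is an assumption, since the text does not say how such slices were scored.

```python
def dice(pred, gt):
    """DSC from Eq. (5) for flat 0/1 lists; None when the ratio is 0/0."""
    tp = sum(1 for p, g in zip(pred, gt) if p == 1 and g == 1)
    fp = sum(1 for p, g in zip(pred, gt) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(pred, gt) if p == 0 and g == 1)
    den = fp + 2 * tp + fn
    return None if den == 0 else 2 * tp / den


def slice_averaged_dice(pred_slices, gt_slices):
    """Mean of per-slice DSC, skipping undefined slices (an assumption)."""
    scores = [dice(p, g) for p, g in zip(pred_slices, gt_slices)]
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores) if scores else None


def volume_dice(pred_slices, gt_slices):
    """DSC computed once over all voxels of the volume."""
    flat_pred = [v for s in pred_slices for v in s]
    flat_gt = [v for s in gt_slices for v in s]
    return dice(flat_pred, flat_gt)


gt_slices = [[1, 1, 1, 1], [0, 0, 0, 0]]
pred_slices = [[1, 1, 1, 1], [0, 0, 0, 1]]  # one false positive in a tooth-free slice
```

In this toy case the slice-averaged DSC is (1 + 0) / 2 = 0.5, whereas the volume-based DSC is 8/9, or about 0.889, because the single false positive pixel carries little weight once all voxels are pooled.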

Majority voting has been used to improve the segmentation performance of anatomic structures on MR images³⁸, conventional CT images³⁹, and CBCT^24,40 by combining the predictions from axial, coronal, and sagittal images. We intentionally applied different voting strategies to five original U-Nets (i.e., 2Da U-Net, 2Dc U-Net, 2Ds U-Net, 2.5Da U-Net, and 3D U-Net) to generate 4 additional virtual U-Nets (i.e., 2.5Dv U-Net, 3.5Dv5 U-Net, 3.5Dv4 U-Net, 3.5Dv3 U-Net) in order to compare the performance of different weightings of majority voting. The 2.5Dv U-Net integrated results from three 2D U-Nets (2Da U-Net, 2Dc U-Net, and 2Ds U-Net) as used in prior studies^24,38,39, while the 3.5D U-Nets integrated these 2D U-Nets together with the additional 2.5Da U-Net and 3D U-Net. Our results show that the U-Nets with majority voting (2.5Dv U-Net, 3.5Dv3 U-Net, and 3.5Dv5 U-Net) improved segmentation performance, with DSC significantly higher than that of the originally trained U-Nets. By integrating the five originally trained U-Nets, the 3.5Dv5 U-Net showed the highest DSC, accuracy, specificity, and NPV.

Diminutive noise speckles could be eliminated using mathematical morphology⁴¹. The combination of erode and dilate operators is capable of noise removal by eroding the image with a kernel followed by dilating the image with another kernel. By applying 3D erosion and dilation, our results showed significant changes in segmentation performance: significantly higher specificity and PPV in all U-Nets, significantly higher DSC and accuracy in all originally trained U-Nets, significantly lower DSC and accuracy in all U-Nets with majority voting, and significantly lower sensitivity and NPV in all U-Nets.

Our study has some limitations to be addressed. First, the sample size of our study is relatively small. Our sample size is similar to that in Li’s study (N = 24), Chen’s study (N = 25)²⁹, Wu’s study (N = 20)³², Wang’s study (N = 28)²⁷, and Duan’s study (N = 30)²⁰. To remedy this, we applied fourfold cross validation to verify our results. Second, the GT was not purely defined by senior dentists but by a third-year resident in periodontology and 3 different junior researchers, leading to potential bias in defining the GT of teeth. To remedy this, all GTs were verified and corrected slice by slice by a senior neuroradiologist. Third, we did not evaluate interobserver agreement and intraobserver reliability in this study. Further study designed to evaluate the interobserver agreement and intraobserver reliability is warranted to reduce the potential bias occurring in the step of GT generation. Fourth, we did not apply any bounding box for the teeth in our study. We intentionally used the whole volume of CBCT to train and test all U-Nets in order to compare the segmentation performance of U-Nets with different training strategies not only in the teeth-containing slices but also in slices beyond the levels of teeth. Fifth, we did not calculate the volume-based performance metrics as in previous studies. By using slice-based performance metrics, our study clearly discloses the pros and cons of different training strategies of U-Nets on the one hand and also allows comparison between our results and others’ results on the other hand. Finally, we did not evaluate the diagnostic performance of the proposed method in any specific dental pathologies, although the majority (75%) of patients received CBCT examination in order to evaluate the details of impacted teeth. To evaluate the diagnostic performance of the proposed 3.5D U-Net, further study enrolling specific dental pathology is warranted.

Conclusion

The performance of U-Nets varies among different training strategies for teeth segmentation on CBCT. The segmentation performance of the U-Net can be improved by majority voting and E&D. Overall, the 3.5Dv5 U-Net achieved the best segmentation performance among all U-Nets.

Data availability

The datasets used or analyzed during the current study are available from the corresponding author on reasonable request.

Abbreviations

2D U-Net:: U-Net using a 2D image as the unit of the input data
2Da U-Net:: U-Net using an axial slice as the unit of the input data
2Dc U-Net:: U-Net using a coronal slice as the unit of the input data
2Ds U-Net:: U-Net using a sagittal slice as the unit of the input data
2.5Da U-Net:: U-Net using three continuous axial slices as the unit of the input data
2.5Dv U-Net:: U-Net integrating the predictions of 2Da U-Net, 2Dc U-Net, and 2Ds U-Net via majority voting
3D U-Net:: U-Net using a cuboid as the unit of the input data
3.5Dv3 U-Net:: U-Net integrating the predictions of 2.5Dv U-Net, 2.5Da U-Net, and 3D U-Net via majority voting
3.5Dv4 U-Net:: U-Net integrating the predictions of 2Da U-Net, 2Dc U-Net, 2Ds U-Net, and 3D U-Net via majority voting
3.5Dv5 U-Net:: U-Net integrating the predictions of 2Da U-Net, 2Dc U-Net, 2Ds U-Net, 2.5Da U-Net, and 3D U-Net via majority voting
Ac:: Accuracy
CBCT:: Cone beam computed tomography
DLM:: Deep learning model
DSC:: Dice similarity coefficient
E&D:: Erosion and dilation
FN:: False negative
FP:: False positive
GT:: Ground truth
HMDB:: Heavy metallic dental burden
NPV:: Negative predictive value
PPV:: Positive predictive value
Sn:: Sensitivity
Sp:: Specificity
TN:: True negative
TP:: True positive

References

Kamburoglu, K. Use of dentomaxillofacial cone beam computed tomography in dentistry. World J. Radiol. 7(6), 128–130. https://doi.org/10.4329/wjr.v7.i6.128 (2015).
Article PubMed PubMed Central Google Scholar
Gaeta-Araujo, H. et al. Cone beam computed tomography in dentomaxillofacial radiology: A two-decade overview. Dentomaxillofac. Radiol. 49(8), 20200145. https://doi.org/10.1259/dmfr.20200145 (2020).
Article PubMed PubMed Central Google Scholar
Mohammad-Rahimi, H. et al. Deep learning for caries detection: A systematic review. J. Dent. 122, 104115. https://doi.org/10.1016/j.jdent.2022.104115 (2022).
Article PubMed Google Scholar
Agrawal, P. & Nikhade, P. Artificial intelligence in dentistry: Past, present, and future. Cureus 14(7), e27405. https://doi.org/10.7759/cureus.27405 (2022).
Article PubMed PubMed Central Google Scholar
Celik, M. E. Deep learning based detection tool for impacted mandibular third molar teeth. Diagnostics (Basel) https://doi.org/10.3390/diagnostics12040942 (2022).
Article Google Scholar
Zhang, X., Zhu, X. & Xie, Z. Deep learning in cone-beam computed tomography image segmentation for the diagnosis and treatment of acute pulpitis. J. Supercomput. 78, 11245–11264. https://doi.org/10.1007/s11227-021-04048-0 (2022).
Article Google Scholar
Wang, X., Meng, X. & Yan, S. Deep learning-based image segmentation of cone-beam computed tomography images for oral lesion detection. J. Healthc. Eng. 2021, 4603475. https://doi.org/10.1155/2021/4603475 (2021).
Article PubMed PubMed Central Google Scholar
Qiu, B. et al. Robust and accurate mandible segmentation on dental CBCT scans affected by metal artifacts using a prior shape model. J. Pers. Med. https://doi.org/10.3390/jpm11050364 (2021).
Article PubMed PubMed Central Google Scholar
Sabanci, S. et al. Is manual segmentation the real gold standard for tooth segmentation? A preliminary in vivo study using conebeam computed tomography images. Meandros Med. Dent. J. 22, 263–273 (2021).
Kang, H. C., Choi, C., Shin, J., Lee, J. & Shin, Y. G. Fast and accurate semiautomatic segmentation of individual teeth from dental CT images. Comput. Math. Methods Med. 2015, 810796. https://doi.org/10.1155/2015/810796 (2015).
Luo, D., Zeng, W., Chen, J. & Tang, W. Deep learning for automatic image segmentation in stomatology and its clinical application. Front. Med. Technol. 3, 767836. https://doi.org/10.3389/fmedt.2021.767836 (2021).
Nagarajappa, A. K., Dwivedi, N. & Tiwari, R. Artifacts: The downturn of CBCT image. J. Int. Soc. Prev. Commun. Dent. 5(6), 440–445. https://doi.org/10.4103/2231-0762.170523 (2015).
Venkatesh, E. & Elluru, S. V. Cone beam computed tomography: Basics and applications in dentistry. J. Istanb. Univ. Fac. Dent. 51(3 Suppl 1), S102–S121. https://doi.org/10.17096/jiufd.00289 (2017).
Schulze, R. et al. Artefacts in CBCT: A review. Dentomaxillofac. Radiol. 40(5), 265–273. https://doi.org/10.1259/dmfr/30642039 (2011).
Endo, M., Tsunoo, T., Nakamori, N. & Yoshida, K. Effect of scattered radiation on image noise in cone beam CT. Med. Phys. 28(4), 469–474. https://doi.org/10.1118/1.1357457 (2001).
Farman, A. G. Guest editorial: Self-referral: An ethical concern with respect to multidimensional imaging in dentistry? J. Appl. Oral Sci. https://doi.org/10.1590/s1678-77572009000500001 (2009).
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521(7553), 436–444. https://doi.org/10.1038/nature14539 (2015).
Ronneberger, O., Fischer, P. & Brox, T. U-Net: Convolutional networks for biomedical image segmentation. Med. Image Comput. Comput.-Assist. Intervent. 9351, 234–241 (2015).
Yin, X. X., Sun, L., Fu, Y., Lu, R. & Zhang, Y. U-Net-based medical image segmentation. J. Healthc. Eng. 2022, 4189781. https://doi.org/10.1155/2022/4189781 (2022).
Duan, W., Chen, Y., Zhang, Q., Lin, X. & Yang, X. Refined tooth and pulp segmentation using U-Net in CBCT image. Dentomaxillofac. Radiol. 50(6), 20200251. https://doi.org/10.1259/dmfr.20200251 (2021).
Li, Q. et al. Automatic tooth roots segmentation of cone beam computed tomography image sequences using U-net and RNN. J. Xray Sci. Technol. 28(5), 905–922. https://doi.org/10.3233/XST-200678 (2020).
Zhou, H. et al. Ensemble learning and tensor regularization for cone-beam computed tomography-based pelvic organ segmentation. Med. Phys. 49(3), 1660–1672. https://doi.org/10.1002/mp.15475 (2022).
Shaheen, E. et al. A novel deep learning system for multi-class tooth segmentation and classification on cone beam computed tomography. A validation study. J. Dent. 115, 103865. https://doi.org/10.1016/j.jdent.2021.103865 (2021).
Minnema, J. et al. Comparison of convolutional neural network training strategies for cone-beam CT image segmentation. Comput. Methods Programs Biomed. 207, 106192. https://doi.org/10.1016/j.cmpb.2021.106192 (2021).
Juan, C. J. et al. Improving interobserver agreement and performance of deep learning models for segmenting acute ischemic stroke by combining DWI with optimized ADC thresholds. Eur. Radiol. https://doi.org/10.1007/s00330-022-08633-6 (2022).
Chang, Y. J., Huang, T. Y., Liu, Y. J., Chung, H. W. & Juan, C. J. Classification of parotid gland tumors by using multimodal MRI and deep learning. NMR Biomed. 34(1), e4408. https://doi.org/10.1002/nbm.4408 (2021).
Wang, H. et al. Multiclass CBCT image segmentation for orthodontics with deep learning. J. Dent. Res. 100(9), 943–949. https://doi.org/10.1177/00220345211005338 (2021).
Lim, M. & Hacihaliloglu, I. Structure-enhanced local phase filtering using L0 gradient minimization for efficient semiautomated knee magnetic resonance imaging segmentation. J. Med. Imaging (Bellingham) 3(4), 044503. https://doi.org/10.1117/1.JMI.3.4.044503 (2016).
Chen, Y. et al. Automatic segmentation of individual tooth in dental CBCT images from tooth surface map by a multi-task FCN. IEEE Access 8, 97296–97309 (2020).
Fontenele, R. C. et al. Influence of dental fillings and tooth type on the performance of a novel artificial intelligence-driven tool for automatic tooth segmentation on CBCT images—A validation study. J. Dent. 119, 104069. https://doi.org/10.1016/j.jdent.2022.104069 (2022).
Lahoud, P. et al. Artificial intelligence for fast and accurate 3-dimensional tooth segmentation on cone-beam computed tomography. J. Endod. 47(5), 827–835. https://doi.org/10.1016/j.joen.2020.12.020 (2021).
Wu, X. et al. Center-sensitive and boundary-aware tooth instance segmentation and classification from cone-beam CT. In 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI). 939–942 (2020).
Cui, Z., Li, C. & Wang, W. ToothNet: Automatic tooth instance segmentation and identification from cone beam CT images. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, CA, USA. 6368–6377 (2019).
Lee, S. et al. Automated CNN-based tooth segmentation in cone-beam CT for dental implant planning. IEEE Access 8, 50507–50518 (2020).
Rao, Y. et al. A symmetric fully convolutional residual network with DCRF for accurate tooth segmentation. IEEE Access 8, 92028–92038 (2020).
Tian, S. et al. Automatic classification and segmentation of teeth on 3D dental model using hierarchical deep learning networks. IEEE Access 7, 84817–84828 (2019).
Xu, X., Liu, C. & Zheng, Y. 3D tooth segmentation and labeling using deep convolutional neural networks. IEEE Trans. Vis. Comput. Graph. 25(7), 2336–2348. https://doi.org/10.1109/TVCG.2018.2839685 (2019).
Mlynarski, P., Delingette, H., Alghamdi, H., Bondiau, P. Y. & Ayache, N. Anatomically consistent CNN-based segmentation of organs-at-risk in cranial radiotherapy. J. Med. Imaging (Bellingham) 7(1), 014502. https://doi.org/10.1117/1.JMI.7.1.014502 (2020).
Zhou, X., Takayama, R., Wang, S., Hara, T. & Fujita, H. Deep learning of the sectional appearances of 3D CT images for anatomical structure segmentation based on an FCN voting method. Med. Phys. 44(10), 5221–5233. https://doi.org/10.1002/mp.12480 (2017).
Wang, L. et al. Automated segmentation of dental CBCT image with prior-guided sequential random forests. Med. Phys. 43(1), 336. https://doi.org/10.1118/1.4938267 (2016).
Jamil, N., Sembok, T. M. T. & Bakar, Z. A. Noise removal and enhancement of binary images using morphological operations. Int. Sympos. Inf. Technol. 2008, 1–6 (2008).

Funding

This study was funded by Tri-Service General Hospital (Grant No. TSGH-D-111147); the Ministry of Science and Technology, Taiwan (Grant Nos. 111-2314-B-035-001-MY3 and 111-2314-B-039-036); and China Medical University Hsinchu Hospital, Taiwan (Grant No. CMUH110-REC3-180).

Author information

These authors contributed equally: Yi-Jui Liu and Chun-Jung Juan.

Authors and Affiliations

Department of Periodontology, School of Dentistry, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, ROC
Kang Hsu & Da-Yo Yuh
School of Dentistry and Graduate Institute of Dental Science, National Defense Medical Center, Taipei, Taiwan, ROC
Kang Hsu
Department of Medical Imaging, China Medical University Hsinchu Hospital, 199, Sec. 1, Xinglong Rd, Zhubei, 302, Hsinchu, Taiwan, ROC
Shao-Chieh Lin, Pin-Sian Lyu, Guan-Xin Pan, Yi-Chun Zhuang, Chia-Ching Chang, Cheng-Hsuan Juan & Chun-Jung Juan
Ph.D. Program in Electrical and Communication Engineering, Feng Chia University, Taichung, Taiwan, ROC
Shao-Chieh Lin
Department of Automatic Control Engineering, Feng Chia University, No. 100 Wenhwa Rd., Seatwen, 40724, Taichung, Taiwan, ROC
Pin-Sian Lyu, Cheng-En Juan & Yi-Jui Liu
Master’s Program of Biomedical Informatics and Biomedical Engineering, Feng Chia University, Taichung, Taiwan, ROC
Guan-Xin Pan, Yi-Chun Zhuang, Tung-Yang Lee & Cheng-Hsuan Juan
Department of Management Science, National Yang Ming Chiao Tung University, Taipei, Taiwan, ROC
Chia-Ching Chang
Department of Biomedical Engineering and Environmental Sciences, National Tsing Hua University, Hsinchu, Taiwan, ROC
Hsu-Hsia Peng & Chun-Jung Juan
Cheng Ching Hospital, Taichung, Taiwan, ROC
Tung-Yang Lee & Cheng-Hsuan Juan
Department of Radiology, School of Medicine, College of Medicine, China Medical University, Taichung, Taiwan, ROC
Chun-Jung Juan
Department of Medical Imaging, China Medical University Hospital, Taichung, Taiwan, ROC
Chun-Jung Juan
Department of Biomedical Engineering, National Defense Medical Center, Taipei, Taiwan, ROC
Chun-Jung Juan
Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan, ROC
Chun-Jung Juan

Contributions

K.H., Y.J.L., and C.J.J. formulated the research concept. S.C.L., Y.J.L., and C.J.J. designed the research and constructed the study pipeline. K.H., P.S.L., G.X.P., and Y.C.Z. initially defined the ground truth. C.J.J. supervised the preparation of the ground truth. S.C.L., P.S.L., and Y.J.L. conducted the image processing and analysis. K.H., C.C.C., and D.Y.Y. contributed to data preparation. H.H.P., Y.J.L., and C.J.J. contributed to data cleaning and quality control. T.Y.L., C.H.J., and C.E.J. prepared Figures 1–3 and 10–11. K.H., T.Y.L., C.H.J., C.E.J., and C.J.J. wrote the main manuscript. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Yi-Jui Liu or Chun-Jung Juan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Tables.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Hsu, K., Yuh, DY., Lin, SC. et al. Improving performance of deep learning models using 3.5D U-Net via majority voting for tooth segmentation on cone beam computed tomography. Sci Rep 12, 19809 (2022). https://doi.org/10.1038/s41598-022-23901-7

Received: 26 June 2022
Accepted: 07 November 2022
Published: 17 November 2022
DOI: https://doi.org/10.1038/s41598-022-23901-7
