Abstract
Assisted living facilities cater to the needs of the elderly population, providing assistance and support with day-to-day activities. Fall detection is fundamental to ensuring residents' well-being and safety: falls are frequent among older persons and can cause severe injuries and complications. Incorporating computer vision (CV) techniques into assisted living environments offers a transformative solution to these issues. By leveraging cameras and advanced algorithms, a CV system can monitor residents' movements continuously and identify potential fall events in real time. Driven by deep learning (DL) techniques, CV enables continuous surveillance of people through cameras, analysing complex visual information to detect fall risks or actual fall instances quickly. Trained on large volumes of visual data, such a system improves its capability to identify falls precisely while minimizing false alarms. Combining CV and DL enhances the efficiency and reliability of fall detection and enables proactive intervention, considerably decreasing response times in emergencies. This study introduces a new Deep Feature Fusion with Computer Vision for Fall Detection and Classification (DFFCV-FDC) technique. The primary purpose of the DFFCV-FDC approach is to employ CV to detect fall events. Accordingly, the DFFCV-FDC approach uses the Gaussian filtering (GF) approach for noise removal. Besides, a deep feature fusion process comprising MobileNet, DenseNet, and ResNet models is involved. To improve the performance of the DFFCV-FDC technique, improved pelican optimization algorithm (IPOA) based hyperparameter selection is performed. Finally, falls are detected using the denoising autoencoder (DAE) model. The performance of the DFFCV-FDC methodology was examined on benchmark fall databases. A comprehensive comparative study showed the superiority of the DFFCV-FDC approach over existing techniques.
Introduction
Falls are the main cause of severe injuries among aged people globally and hinder independent, comfortable living. Statistics show that falls are the leading cause of injury-related death for people aged 80 or over. Many falls happen at home owing to environmental hazards1. The most common hazards include clutter, poor lighting, slippery floors, obstructed pathways, pets, and unsteady furniture. Aged people suffering from neurological diseases such as dementia and epilepsy are more prone to falls and fall-related injuries than the general aged population2. The trend of older people living independently, apart from their family members, is another leading contributor to fall-related casualties. Falls in cluttered surroundings can lead to bleeding, concussions, and other serious health consequences, including death3. Owing to independent living and the absence of fall-detection technology, emergency services often do not respond to fall events in time. Numerous surveillance methods have been proposed to compensate for the lack of round-the-clock nursing4. It is very complex to create an environment that is entirely fall-proof; therefore, fall-detection and rescue services are essential to guarantee the safety of the aged population.
Fall-detection methods help distinguish falls from non-fall activities so that a warning is automatically sent to a remote monitoring point when the patient falls, or a protective airbag is deployed5. In recent times, numerous models have been proposed using different sensors with varying levels of performance. The most common sensor technologies employed for fall detection fall into three categories: infrared, wearable, and camera sensors6. Camera- and infrared-based sensors are generally costly and record audio-visual signals, but they raise patient-privacy concerns and are fixed in place. Owing to these restrictions, wearable sensors provide a cheaper alternative, identifying falls using one or more sensors attached to the user's body that can therefore be carried anywhere7. Vision-based fall recognition is a strong alternative that delivers a lower-cost solution to the fall-recognition problem. Artificial intelligence (AI), specifically DL, is highly effective for this challenge.
Similarly, owing to the increased usage of IoT solutions and cameras in public places such as bus stands, airports, roads, railway stations, streets, and residences, vision-based models for fall recognition are an excellent choice for the future as well8. There are numerous techniques for fall recognition, and DL-based models are advancing rapidly. Compared with other approaches, DL does not require a hand-crafted feature extractor: features are extracted automatically9. DL is also well known for its generalization ability; a model trained on one database can be applied to a different problem using transfer learning (TL). The performance of DL-based approaches is excellent compared with other models10, and DL can be deployed on low-compute edge devices using few-shot learning and TL.
This study introduces a new Deep Feature Fusion with Computer Vision for Fall Detection and Classification (DFFCV-FDC) technique. The primary purpose of the DFFCV-FDC methodology is to employ the CV concept for detecting fall events. Accordingly, the DFFCV-FDC technique uses the Gaussian filtering (GF) approach for noise eradication. Besides, a deep feature fusion process comprising MobileNet, DenseNet, and ResNet models is involved. To improve the performance of the DFFCV-FDC technique, improved pelican optimization algorithm (IPOA) based hyperparameter selection is performed. Finally, the detection of falls is identified using the denoising autoencoder (DAE) model. The performance analysis of the DFFCV-FDC approach has been examined on the benchmark fall database.
Literature review
Durga Bhavani and Ferni Ukrit11 propose a novel Inception with deep CNN-based fall detection and classification (INDCNN-FDC) method. The INDCNN-FDC technique applies two data-preprocessing stages, namely guided filtering (GIF)-based image smoothing and GF-based image sharpening. Furthermore, the proposed INDCNN-FDC method uses the deep TL-based Inceptionv3 model to produce a valuable set of feature vectors. Lastly, a DCNN takes the feature vector as input and performs fall recognition. Şengül et al.12 proposed a mobile application that collects acceleration and gyroscope sensor data and transmits it to the cloud, where a DL model classifies the activity into the assumed classes. The method also applies bicubic Hermite interpolation, and a Bi-LSTM neural network is implemented for activity identification. Kabir et al.13 developed a class-ensemble technique based on CNN and LSTM models for a 3-class fall taxonomy (non-fall, pre-fall, and fall) using data from a gyroscope and accelerometer. The technique leverages the CNN for robust feature extraction from the gyroscope and accelerometer data, while LSTM networks model the temporal dynamics of the fall process. In14, a complete system called TSFallDetect is proposed, with an embedded-sensor receiving device, a mobile DL model-deployment platform, and a server to collect models and data for future development. The study employs sequential DL methods to forecast falling gestures using data collected from both video and inertial pressure sensors, and introduces a new DL technique specifically designed for analyzing time-series data in the context of fall prediction.
Mohammad et al.15 proposed a design for a wearable monitoring system. The prototype of this model involved the offline training of a combined DNN structure based on an RNN and a CNN: the CNN serves as a robust feature extractor for the gyroscope and accelerometer data, and the RNN models the temporal dynamics of the falling process. A class-based ensemble structure was utilized, where each ensemble member recognizes one specific class. In16, a DL-based pre-impact fall detection system (FDS) is offered. To achieve this, an automated feature-extraction technique is proposed that can extract temporal features from all kinds of human fall data gathered using wearable sensors. Ong et al.17 presented comprehensive research on fall detection and prediction for reconfigurable stair-accessing machines by leveraging DL. The developed architecture incorporates ML models and RNNs, specifically LSTM and BiLSTM, for fall recognition of service robots on staircases; the fall data essential for training the models is produced in a simulation environment. Alabdulkreem et al.18 proposed a Chameleon Swarm Algorithm with Improved Fuzzy DL for Fall Detection (CSA-IDFLFD) model. The CSA-IDFLFD approach encompasses two stages: in the first, the IDFL technique identifies fall events; in the second, the parameters of the IDFL approach are optimally selected by the CSA technique.
Limitations and research gap
The cited studies highlight several advances and open challenges in fall-detection systems. Methods such as Inception with deep CNNs face computational intensity and robustness challenges, suggesting a research gap in scalability and real-world applicability under diverse environmental conditions. Mobile applications collecting sensor data face data-transmission delays and privacy concerns, signalling gaps in optimizing data handling and security protocols. Techniques integrating CNN and LSTM models may need extensive training data, revealing a gap in data acquisition and model-generalization ability. Integrating embedded sensors with DL models may face difficulties in real-time data processing and synchronization, underscoring a gap in real-time performance and reliability. Wearable devices for fall monitoring may face user-acceptance and sensor-reliability issues, necessitating improvements in usability and technological integration. DL-based pre-impact fall-detection methods may encounter discrepancies between human motion patterns and real-time processing demands, underscoring gaps in algorithm robustness and effectiveness. Lastly, integrating swarm intelligence and fuzzy logic with DL in fall-detection systems may introduce optimization complexities, accentuating a gap in fusion-model integration and performance enhancement across varied datasets and scenarios.
Proposed methodology
System architecture
In this study, a novel DFFCV-FDC methodology is proposed. The main aim of the DFFCV-FDC methodology is to employ CV to detect fall events. Accordingly, the DFFCV-FDC methodology comprises GF-based noise elimination, a feature-fusion process, parameter optimization, and a DAE-based fall detection and classification process. Figure 1 shows the overall flow of the DFFCV-FDC technique.
Noise elimination module
At the primary level, the DFFCV-FDC technique uses the GF approach for noise removal. Image preprocessing prepares the images for training and inference19; it is not restricted to orientation correction, resizing, and colour adjustments. Moreover, preprocessing decreases training time and speeds up model inference. A Gaussian function is employed to eliminate the noise.
GF: Smoothing images with a GF is highly effective because it mirrors the process of visual perception on which it is based: neurons examining a visual scene respond with a similar filter profile. The half-tone image shown on the right of the figure is the result of smoothing the input with a GF. The GF is the non-uniform low-pass filter defined in Eq. (1):

$$G\left(x,y\right)=\frac{1}{2\pi {\sigma }^{2}}\text{exp}\left(-\frac{{x}^{2}+{y}^{2}}{2{\sigma }^{2}}\right)$$ (1)

where \({\sigma }^{2}\) refers to the variance and the mean is zero; convolving the input image with this kernel yields the preprocessed image.
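The paper does not give implementation details for this step; a minimal NumPy sketch of Gaussian smoothing, building the zero-mean kernel of Eq. (1) and convolving it with a grayscale image (kernel size and \(\sigma\) are illustrative choices, not values from the paper):

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Normalized 2-D Gaussian kernel with zero mean and variance sigma^2."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def gaussian_smooth(img, size=5, sigma=1.0):
    """Smooth a grayscale image by direct convolution with the kernel."""
    k = gaussian_kernel(size, sigma)
    pad = size // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.empty(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + size, j:j + size] * k)
    return out
```

Because the kernel sums to one, flat regions are preserved while pixel-level noise is averaged away, which is what makes GF a low-pass filter.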
DL model architecture
Deep feature fusion process
At this level, a deep feature fusion process comprising MobileNet, DenseNet, and ResNet models is involved.
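The paper does not specify the fusion operator; one common choice, sketched below under that assumption, is to L2-normalize the feature vector from each backbone and concatenate the three vectors into a single fused descriptor:

```python
import numpy as np

def fuse_features(feat_mobilenet, feat_densenet, feat_resnet):
    """Late fusion: L2-normalize each backbone's feature vector, then concatenate."""
    parts = []
    for f in (feat_mobilenet, feat_densenet, feat_resnet):
        f = np.asarray(f, dtype=float)
        norm = np.linalg.norm(f)
        parts.append(f / norm if norm > 0 else f)
    return np.concatenate(parts)
```

Normalizing per branch keeps one backbone's larger activations from dominating the fused representation; the fused vector's length is simply the sum of the three feature dimensions.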
MobileNet
A CNN architecture known as MobileNet eliminates the need for excessive computational power20. MobileNet factorizes a standard convolution into a depthwise and a pointwise convolution, each followed by Batch Normalization (BN) and ReLU. The key difference between MobileNet and conventional CNN models lies in replacing the standard convolutional layer, whose filter depth equals the depth of the input feature map, with two distinct operations: a depthwise convolution and a pointwise convolution.
The architecture therefore deploys depthwise separable convolutions, comprising depthwise and pointwise layers followed by BN and ReLU activation. MobileNet's body consists of depthwise separable convolutions, with the exception of the first layer, which is a full convolution. Describing the network in such simple terms permits easy topology exploration when developing the network.
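The computational saving of this factorization can be made concrete by counting weights. The sketch below (illustrative, not from the paper) compares a standard \(k\times k\) convolution against its depthwise-plus-pointwise replacement:

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def separable_params(k, c_in, c_out):
    """Depthwise (k x k filter per input channel) plus pointwise (1 x 1) weights."""
    return k * k * c_in + c_in * c_out
```

For a 3x3 convolution mapping 64 to 128 channels, the standard form needs 73,728 weights while the separable form needs 8,768, a reduction by roughly the factor \(1/c_{out} + 1/k^{2}\) that underlies MobileNet's efficiency.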
DenseNet
The DenseNet201 technique is based on the DenseNet structure presented by Huang et al. DenseNet introduces dense connections, in which every layer is connected to all preceding layers21. This network structure improves gradient flow, decreases the parameter count, and encourages feature reuse. DenseNet201 extends this idea to a deeper network of 201 layers comprising transition layers and dense blocks.
This structure is organized as a sequence of dense blocks, with transition layers between every pair of neighbouring blocks. These transition layers play a vital role in changing the feature-map sizes using convolution and pooling operations. By integrating them, the method efficiently manages the data flow and adjusts the feature-map sizes to enable effective learning and information propagation through the network. This allows DenseNet to exploit the benefits of dense connections while adaptively fine-tuning the feature sizes, enhancing performance in several DL tasks.
The model architecture used here is crafted for binary classification and contains an input layer, a DenseNet201 layer, a global average pooling layer, a flatten layer, dense layers, and a classification layer. The model includes a total of 19,371,458 parameters, of which 19,142,402 are trainable, and is compiled with the Adam optimizer using a learning rate of 0.001. In summary, the design integrates DenseNet201 for feature extraction with dense layers for classification, yielding a high-accuracy classifier.
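The dense-connectivity pattern can be illustrated with a toy block in which each layer receives the concatenation of all previous feature maps and contributes a fixed number of new channels (the growth rate). This is a didactic NumPy sketch, not the DenseNet201 implementation; the 1x1-mix with random weights stands in for the block's convolutions:

```python
import numpy as np

def dense_block(x, num_layers=4, growth_rate=12, seed=0):
    """Toy dense block on a channels-last tensor: every layer sees the
    concatenation of all earlier outputs and appends growth_rate channels."""
    rng = np.random.default_rng(seed)
    features = [x]
    for _ in range(num_layers):
        inp = np.concatenate(features, axis=-1)        # dense connection
        w = rng.normal(size=(inp.shape[-1], growth_rate))
        new = np.maximum(inp @ w, 0.0)                 # 1x1-conv-like mix + ReLU
        features.append(new)
    return np.concatenate(features, axis=-1)
```

The output channel count is input channels plus `num_layers * growth_rate`, showing why dense blocks reuse features cheaply: each layer adds only a thin slice of new channels while still seeing everything computed before it.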
ResNet
As the layer count of a plain network increases, the quality of the weight matrices diminishes, reducing the ability to learn features; this degradation can prevent deeper networks from outperforming shallower ones. He et al.22 presented the ResNet model to effectively resolve the problems of gradient explosion and vanishing. It also markedly improves the training performance and efficiency of DNNs and continues to promote the progress of DL technology. ResNet is available at various depths, such as ResNet_18, ResNet_50, and ResNet_101. In this research, the ResNet34 network is employed. First, an image of size \(224\times 224\) is input into a convolutional layer with a \(7\times 7\) kernel, 64 kernels, a stride of 2, and padding of 3, producing an output of size \(64\times 112\times 112\). It should also be noted that there are two types of connecting lines: solid and dashed. Solid lines indicate that the input and output sizes are the same, so the shortcut is added directly; dashed lines indicate a change in dimensions. Figure 2 depicts the framework of ResNet.
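The "solid line" case, where input and output sizes match and the shortcut is a direct addition, can be sketched as a minimal residual block (illustrative NumPy, with dense maps standing in for the block's convolutions):

```python
import numpy as np

def residual_block(x, w1, w2):
    """y = relu(F(x) + x): two weight layers with an inner ReLU, plus the
    identity shortcut used when input and output sizes agree."""
    f = np.maximum(x @ w1, 0.0) @ w2   # residual branch F(x)
    return np.maximum(f + x, 0.0)      # identity shortcut, then activation
```

With the residual branch driven to zero (e.g. zero weights), the block reduces to the identity on non-negative inputs, which is precisely why stacking such blocks does not degrade training: gradients always have the shortcut path to flow through.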
Parameter optimization
To improve the performance of the DFFCV-FDC technique, IPOA-based hyperparameter selection is performed. Trojovský and Dehghani proposed the pelican optimization algorithm (POA) in 2022, a metaheuristic that simulates the hunting behaviour of pelicans23. Approaching the prey (exploration phase) and surface flight (development phase) are the two phases of POA.
Initialization
It is necessary to initialize the population before hunting, where each individual represents a candidate solution. This is expressed mathematically as:

$${X}_{i,j}={l}_{j}+rand\cdot \left({u}_{j}-{l}_{j}\right), i=1,2,\dots ,N, j=1,2,\dots ,m$$

where \({X}_{i,j}\) is the location of the \(i\)th pelican in the \(j\)th dimension, \(N\) is the number of pelicans in the population, \(m\) is the dimensionality of the problem, and \(rand\) is a random number within [0,1]. \({u}_{j}\) and \({l}_{j}\) are the upper and lower bounds of the \(j\)th dimension, respectively.
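The initialization rule above can be sketched directly in NumPy (a minimal illustration, with bounds and population size as free parameters):

```python
import numpy as np

def init_population(n, m, lb, ub, seed=1):
    """x[i, j] = l_j + rand * (u_j - l_j): one row per pelican, uniform in the box."""
    rng = np.random.default_rng(seed)
    lb = np.asarray(lb, dtype=float)
    ub = np.asarray(ub, dtype=float)
    return lb + rng.random((n, m)) * (ub - lb)
```

Each candidate is drawn uniformly inside the search box, so the initial swarm covers the feasible region without bias toward any corner.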
Exploration stage
Initially, the prey position is generated randomly in the search range, and the pelican locates the prey. If the objective value of the prey is smaller than that of the pelican, the pelican approaches the prey; otherwise, it moves away from the prey, as follows:

$${X}_{i}^{{P}_{1}}=\left\{\begin{array}{ll}{X}_{i}+rand\cdot \left(P-I\cdot {X}_{i}\right),& {F}_{p}<{F}_{i}\\ {X}_{i}+rand\cdot \left({X}_{i}-P\right),& otherwise\end{array}\right.$$

where \({X}_{i}^{{P}_{1}}\) is the location of the \(i\)th pelican after the first-stage update, \(I\) is a random integer equal to 1 or 2, \(P\) is the prey location, \(rand\) is a random value within [0,1], and \({F}_{p}\) and \({F}_{i}\) are the fitness values (FVs) of the prey and the \(i\)th pelican, respectively.
After approaching the prey, the pelican accepts the new location only if the FV of the new location is better than that of the previous one:

$${X}_{i}=\left\{\begin{array}{ll}{X}_{i}^{new},& {F}_{i}^{new}<{F}_{i}\\ {X}_{i},& otherwise\end{array}\right.$$

where \({X}_{i}^{new}\) denotes the updated location of the \(i\)th pelican and \({F}_{i}^{new}\) denotes the FV of the updated location.
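A sketch of one exploration step for a single pelican, combining the move-toward/move-away rule with the greedy acceptance above (illustrative NumPy; the objective is any minimization target):

```python
import numpy as np

def exploration_step(x, fx, prey, f_prey, objective, seed=2):
    """Move toward the prey if it is fitter, otherwise away from it;
    keep the candidate only when it improves the fitness (greedy update)."""
    rng = np.random.default_rng(seed)
    I = int(rng.integers(1, 3))          # random intensity, 1 or 2
    r = rng.random(x.shape)
    if f_prey < fx:
        cand = x + r * (prey - I * x)    # approach the prey
    else:
        cand = x + r * (x - prey)        # move away from the prey
    f_cand = objective(cand)
    return (cand, f_cand) if f_cand < fx else (x, fx)
```

The greedy acceptance guarantees the fitness of each pelican never worsens, which is what makes the population's best value monotonically non-increasing over iterations.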
Development stage
In this stage, having reached the water surface, the pelicans capture the prey by searching points within a neighbourhood of their current location to achieve better convergence:

$${X}_{i}^{{P}_{2}}={X}_{i}+R\cdot \left(1-\frac{t}{T}\right)\cdot \left(2\cdot rand-1\right)\cdot {X}_{i}$$

where \({X}_{i}^{{P}_{2}}\) is the location of the \(i\)th pelican after the second-phase update, \(R\) is the constant 0.2, \(rand\) is a random value within [0,1], and \(t\) and \(T\) are the current and maximum iteration counts, respectively.
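The development step can be sketched the same way; the neighbourhood radius \(R\,(1-t/T)\) shrinks as iterations progress, so the search becomes increasingly local (illustrative NumPy with greedy acceptance as in the exploration phase):

```python
import numpy as np

def development_step(x, fx, t, T, objective, R=0.2, seed=3):
    """Local search around the current position; the perturbation radius
    R * (1 - t/T) decays as the iteration t approaches the maximum T."""
    rng = np.random.default_rng(seed)
    cand = x + R * (1.0 - t / T) * (2.0 * rng.random(x.shape) - 1.0) * x
    f_cand = objective(cand)
    return (cand, f_cand) if f_cand < fx else (x, fx)
```

Early iterations allow relatively large perturbations (broad exploitation), while late iterations make only tiny adjustments, which is the mechanism behind POA's convergence behaviour.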
The crisscross optimization algorithm (CSO) is a search technique that exploits horizontal and vertical crossovers to update individuals' positions in the population. The horizontal crossover is an arithmetic crossover applied dimension by dimension between two individuals:

$$\begin{array}{l}M{S}_{hc}\left(i,d\right)={r}_{1}\cdot X\left(i,d\right)+\left(1-{r}_{1}\right)\cdot X\left(j,d\right)+{c}_{1}\cdot \left(X\left(i,d\right)-X\left(j,d\right)\right)\\ M{S}_{hc}\left(j,d\right)={r}_{2}\cdot X\left(j,d\right)+\left(1-{r}_{2}\right)\cdot X\left(i,d\right)+{c}_{2}\cdot \left(X\left(j,d\right)-X\left(i,d\right)\right)\end{array}$$

where \(X\left(i,d\right)\) and \(X\left(j,d\right)\) are the positions of the \(d\)th dimension of the \(i\)th and \(j\)th individuals, respectively; \({r}_{1}\) and \({r}_{2}\) are random values within [0,1]; \({c}_{1}\) and \({c}_{2}\) are random values within [\(-1\),1]; and \(M{S}_{hc}(i,d)\) and \(M{S}_{hc}(j,d)\) are the offspring generated by the horizontal crossover.
The vertical crossover is an arithmetic crossover that operates on a single individual between two dimensions:

$$M{S}_{vc}\left(i,{d}_{1}\right)=r\cdot X\left(i,{d}_{1}\right)+\left(1-r\right)\cdot X\left(i,{d}_{2}\right)$$

where \(X\left(i,{d}_{1}\right)\) and \(X\left(i,{d}_{2}\right)\) are the positions of the \({d}_{1}\)th and \({d}_{2}\)th dimensions of the \(i\)th individual, respectively, \(r\) is a random value within [0,1], and \(M{S}_{vc}(i,{d}_{1})\) is the offspring generated by the vertical crossover.
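Both crossover operators can be sketched compactly (an illustrative NumPy sketch of the standard CSO operators, not the paper's exact implementation):

```python
import numpy as np

def horizontal_crossover(xi, xj, seed=4):
    """Arithmetic crossover of two individuals, applied dimension by dimension."""
    rng = np.random.default_rng(seed)
    r1, r2 = rng.random(xi.shape), rng.random(xi.shape)
    c1, c2 = rng.uniform(-1, 1, xi.shape), rng.uniform(-1, 1, xi.shape)
    off_i = r1 * xi + (1 - r1) * xj + c1 * (xi - xj)
    off_j = r2 * xj + (1 - r2) * xi + c2 * (xj - xi)
    return off_i, off_j

def vertical_crossover(x, d1, d2, seed=5):
    """Arithmetic crossover between two dimensions of a single individual."""
    rng = np.random.default_rng(seed)
    r = float(rng.random())
    child = x.copy()
    child[d1] = r * x[d1] + (1 - r) * x[d2]
    return child
```

The vertical operator only rewrites dimension \(d_1\) as a convex combination of \(d_1\) and \(d_2\), which lets a stagnant dimension borrow information from an active one; the horizontal operator, with its \(c\cdot(x_i-x_j)\) term, can step slightly outside the segment between the two parents, aiding escape from local optima.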
POA can get trapped in local optima because the pelican individuals move within a small range. CSO is incorporated into the local search to enhance the ability to escape local optima, owing to its strong local-development and global-detection abilities. In this work, the current individual moves away from a random individual once the FV of the random individual is smaller than that of the current individual. The horizontal crossover of CSO is introduced to fully exploit the randomly generated individual, guide it toward the target position, and improve both the local-development proficiency and the capability to escape local optima.
where \(X\left(i,j\right)\) and \(P({i}_{t})\) are the current and random individuals, and \({r}_{1}\) and \({r}_{2}\) are randomly generated values within [0,1] and [\(0\),\(2\pi\)], respectively.
The IPOA method is used to derive a fitness function (FF) for achieving higher classifier effectiveness. It returns a positive value to characterize the quality of candidate solutions; here, minimization of the classifier error rate is taken as the FF.
Classifier for fall detection
Finally, falls are detected using the DAE method. The DAE is designed to prevent overfitting in autoencoders (AEs) and to handle more general conditions by adding noise to the input layer24. Meanwhile, the DAE avoids simply copying inputs to outputs and instead learns effective data representations. The DAE neural network has three layers (the input, hidden (HL), and output layers) and performs two major operations, namely encoding and decoding.
After adding noise to the original data, the vector \({x}_{noise}\) is taken as the input-layer data. Subsequently, the encoder operation is applied to the input layer to obtain \(h\); the final output \(z\) is obtained by applying the decoder to the HL.
where \(A\) denotes a random matrix within [0,1], \({W}_{encoder}\), \({W}_{decoder}\), \({b}_{encoder}\), and \({b}_{decoder}\) are the network parameters of the encoder and decoder, and \(f(\cdot )\) and \(g(\cdot )\) are nonlinear activation functions.
The DAE aims to recover the original information from the corrupted input, i.e., the final output \(z\) must be as close as possible to the original data \({x}_{in}\).
Here, \(L\) indicates the error during the training process, and \(n\) signifies the number of training samples.
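The encode-decode pass described above can be sketched as follows. This is a minimal NumPy illustration (Gaussian corruption and sigmoid activations are assumptions; the paper specifies only generic nonlinearities \(f\) and \(g\)):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dae_forward(x_in, W_enc, b_enc, W_dec, b_dec, noise_std=0.1, seed=6):
    """Corrupt the input, encode to h, decode to z; training would minimize
    the mean squared reconstruction error between z and the clean x_in."""
    rng = np.random.default_rng(seed)
    x_noise = x_in + rng.normal(0.0, noise_std, x_in.shape)  # corruption
    h = sigmoid(x_noise @ W_enc + b_enc)                     # encoder
    z = sigmoid(h @ W_dec + b_dec)                           # decoder
    loss = float(np.mean((z - x_in) ** 2))
    return z, loss
```

Because the loss compares the reconstruction against the clean input rather than the corrupted one, the network cannot succeed by memorizing its input and is pushed to learn noise-robust representations, which is the property exploited here for fall classification.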
Results analysis
Data used
The FD outcomes of the DFFCV-FDC technique are examined using the multiple cameras fall (MCF) database25 with the frontal sequence and the URFD database26 with the overhead sequence, as defined in Table 1. Figure 3 portrays sample images from the MCF and URFD datasets. The MCF dataset encompasses 24 scenarios recorded using 8 IP video cameras; the first 22 scenarios feature falls along with confounding events, while the last 2 solely depict confounding events. The URFD dataset encompasses 70 sequences, comprising 30 falls and 40 activities of daily living (ADL). Fall events were captured using 2 Microsoft Kinect cameras along with accelerometer data, while ADL events were recorded using camera 0 and accelerometer data. Sensor data was accumulated using PS Move devices at 60 Hz and x-IMU devices at 256 Hz. The proposed DFFCV-FDC technique was simulated using Python 3.6.5 on a PC with an i5-8600K CPU, GeForce 1050 Ti 4 GB GPU, 16 GB RAM, 250 GB SSD, and 1 TB HDD. The parameter settings are: learning rate 0.01, activation ReLU, epoch count 50, dropout 0.5, and batch size 5.
Result analysis on frontal sequence database
Figure 4 shows the confusion matrices generated by the DFFCV-FDC model on the frontal sequence database over several epoch counts. The results indicate that the DFFCV-FDC model efficiently detects the fall and no-fall instances in all classes.
In Table 2 and Fig. 5, the FD outcomes of the DFFCV-FDC approach on the frontal sequence database are reported. The results show that the DFFCV-FDC approach appropriately recognizes the fall and no-fall events. With 500 epochs, the DFFCV-FDC technique gains an average \(acc{u}_{y}\) of 99.68%, \(pre{c}_{n}\) of 99.33%, \(rec{a}_{l}\) of 99.79%, \({F}_{score}\) of 99.56%, and \({G}_{measure}\) of 99.56%. With 1000 epochs, it attains an average \(acc{u}_{y}\) of 98.73%, \(pre{c}_{n}\) of 98.69%, \(rec{a}_{l}\) of 97.76%, \({F}_{score}\) of 98.22%, and \({G}_{measure}\) of 98.22%. With 1500 epochs, it attains an average \(acc{u}_{y}\) of 96.82%, \(pre{c}_{n}\) of 96.90%, \(rec{a}_{l}\) of 94.18%, \({F}_{score}\) of 95.45%, and \({G}_{measure}\) of 95.50%. With 2000 epochs, it achieves an average \(acc{u}_{y}\) of 98.90%, \(pre{c}_{n}\) of 97.35%, \(rec{a}_{l}\) of 97.35%, \({F}_{score}\) of 97.35%, and \({G}_{measure}\) of 97.35%. Finally, with 3000 epochs, it attains an average \(acc{u}_{y}\) of 98.41%, \(pre{c}_{n}\) of 98.01%, \(rec{a}_{l}\) of 97.56%, \({F}_{score}\) of 97.78%, and \({G}_{measure}\) of 97.78%.
Figure 6 presents the validation accuracy (VALAC) and training accuracy (TRAAC) curves of the DFFCV-FDC approach on the frontal sequence database. The figure offers a useful view of the behaviour of the DFFCV-FDC technique over the epochs, illustrating its learning process and generalization ability. Notably, both TRAAC and VALAC improve steadily with increasing epochs, confirming the adaptiveness of the DFFCV-FDC approach in recognizing patterns in both datasets. The rising trend in VALAC shows the technique's ability to fit the training data while still identifying unseen data precisely, pointing to strong generalization skills.
Figure 7 presents the validation loss (VALLS) and training loss (TRALS) curves of the DFFCV-FDC technique on the frontal sequence database. The progressive reduction in TRALS shows the DFFCV-FDC method refining its weights and diminishing the classification error on both datasets. The figure gives a clear picture of the model's fit to the training data, emphasizing its ability to capture patterns in both databases. Notably, the DFFCV-FDC approach continually tunes its parameters to reduce the difference between the predicted and actual training classes.
Inspecting the precision-recall (PR) curve in Fig. 8, the results confirm that the DFFCV-FDC approach consistently achieves higher PR rates across every class on the frontal sequence database. This confirms the superior ability of the DFFCV-FDC approach to classify dissimilar classes, demonstrating its proficiency in class recognition.
Besides, in Fig. 9, the ROC curves produced by the DFFCV-FDC approach show strong identification of the different labels on the frontal sequence database. They offer a complete view of the trade-off between FPR and TPR over distinct detection thresholds and epoch counts, highlighting the improved classification results of the DFFCV-FDC approach for all classes and its effectiveness in addressing varied identification problems.
In Fig. 10, the comparative outcomes of the DFFCV-FDC approach on the frontal sequence database are described. The outcomes indicate that the 1D-CNN, 2D-CNN, and ResNet101 models show the lowest performance, with minimal \(acc{u}_{y}\) values of 94.31%, 95.63%, and 96.33%, respectively. The VGG16 and VGG19 models attain slightly higher \(acc{u}_{y}\) of 97.66% and 98.25%, respectively, while the EADL-FDC and IWODL-FDDP models accomplish closer \(acc{u}_{y}\) of 99.33% and 99.04%, respectively. However, the DFFCV-FDC technique performs best, with an increased \(acc{u}_{y}\) of 99.68%.
Result analysis on overhead sequence database
Figure 11 shows the confusion matrices generated by the DFFCV-FDC approach on the overhead sequence database over several epoch counts. The results indicate that the DFFCV-FDC approach effectively detects the fall and no-fall samples in all classes.
In Table 3 and Fig. 12, the FD outcomes of the DFFCV-FDC method on the overhead sequence database are described. The outcomes show that the DFFCV-FDC model correctly identifies the fall and no-fall events. With 500 epochs, the DFFCV-FDC method achieves an average \(acc{u}_{y}\) of 96.36%, \(pre{c}_{n}\) of 94.95%, \(rec{a}_{l}\) of 95.35%, \({F}_{score}\) of 95.14%, and \({G}_{measure}\) of 95.14%. With 1000 epochs, it attains an average \(acc{u}_{y}\) of 97.02%, \(pre{c}_{n}\) of 95.48%, \(rec{a}_{l}\) of 96.68%, \({F}_{score}\) of 96.06%, and \({G}_{measure}\) of 96.07%. With 1500 epochs, it achieves an average \(acc{u}_{y}\) of 97.68%, \(pre{c}_{n}\) of 96.70%, \(rec{a}_{l}\) of 97.12%, \({F}_{score}\) of 96.91%, and \({G}_{measure}\) of 96.91%. With 2000 epochs, it reaches an average \(acc{u}_{y}\) of 98.01%, \(pre{c}_{n}\) of 97.34%, \(rec{a}_{l}\) of 97.34%, \({F}_{score}\) of 97.34%, and \({G}_{measure}\) of 97.34%. Lastly, with 3000 epochs, it obtains an average \(acc{u}_{y}\) of 97.35%, \(pre{c}_{n}\) of 96.45%, \(rec{a}_{l}\) of 96.45%, \({F}_{score}\) of 96.45%, and \({G}_{measure}\) of 96.45%.
Figure 13 presents the TRAAC and VALAC curves of the DFFCV-FDC approach on the overhead sequence database. The figure offers a useful view of the behaviour of the DFFCV-FDC method over the epochs, illustrating its learning process and generalization ability. Notably, both TRAAC and VALAC improve steadily with increasing epochs, confirming the adaptive nature of the DFFCV-FDC approach in recognizing patterns in both datasets. The rising trend in VALAC shows the technique's ability to fit the training data while also identifying unseen data precisely, indicating strong generalization skills.
Figure 14 presents the TRALS and VALLS curves of the DFFCV-FDC approach on the overhead sequence database. The progressive decline in TRALS shows the DFFCV-FDC technique adjusting its weights and reducing the classification error on both datasets. The figure gives a clear picture of the model's fit to the training data, emphasizing its ability to capture patterns in both databases. Notably, the DFFCV-FDC method continually tunes its parameters to reduce the difference between the predicted and actual training class labels.
Inspecting the PR curve in Fig. 15, the results confirm that the DFFCV-FDC model consistently accomplishes enhanced PR values for every class on the overhead sequence database. This confirms the improved ability of the DFFCV-FDC approach to identify distinct classes.
In addition, in Fig. 16, the ROC curves produced by the DFFCV-FDC model show strong identification of the dissimilar labels on the overhead sequence database. They offer a complete view of the trade-off between FPR and TPR over distinct detection thresholds and epoch counts, emphasizing the superior classification results of the DFFCV-FDC technique for every class and its efficacy in addressing varied identification problems.
In Fig. 17, the comparative outcomes of the DFFCV-FDC approach on the overhead sequence database are described. The results specify that the 1D-CNN, 2D-CNN, and ResNet101 methodologies show the lowest performance, with the least \(acc{u}_{y}\) values of 92.69%, 95.48%, and 96.69%, respectively. The VGG16 and VGG19 techniques attain slightly higher \(acc{u}_{y}\) of 95.16% and 96.56%, respectively, while the EADL-FDC and IWODL-FDDP models accomplish closer \(acc{u}_{y}\) of 97.34% and 97.02%, respectively. However, the DFFCV-FDC methodology achieves the best result, with an increased \(acc{u}_{y}\) of 98.34%.
Hence, the DFFCV-FDC technique can be applied for enhanced fall recognition results.
Conclusion
In this study, a novel DFFCV-FDC methodology is introduced. The main aim of the DFFCV-FDC methodology is to employ CV to detect fall events. Accordingly, the DFFCV-FDC methodology comprises GF-based noise elimination, a feature-fusion process, parameter optimization, and a DAE-based fall detection and classification process. At the primary level, the DFFCV-FDC technique uses the GF approach for noise removal. Besides, a deep feature fusion process comprising MobileNet, DenseNet, and ResNet models is involved. The IPOA-based hyperparameter selection is performed to improve the performance of the DFFCV-FDC methodology. Finally, falls are detected using the DAE model. The performance of the DFFCV-FDC approach was examined on the benchmark fall databases, and a comprehensive comparative study reported the improvement of the DFFCV-FDC technique over existing models. The limitations of the DFFCV-FDC technique include the need for robustness testing across varied real-world camera settings and lighting conditions to confirm consistent performance. Future work may focus on improving the model's sensitivity to subtle fall cues, incorporating more sophisticated anomaly-detection techniques, and addressing the privacy concerns associated with continuous video surveillance in private spaces. Furthermore, exploring real-time implementation challenges and optimizing computational efficiency for deployment on edge devices is important for practical application in assisted living and healthcare environments.
Data availability
The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.
References
Lotfi, A. et al. Supporting independent living for older adults; employing a visual-based fall detection through analyzing the motion and shape of the human body. IEEE Access 6, 70272–70282 (2018).
Lin, C. B. et al. A framework for fall detection based on OpenPose skeleton and LSTM/GRU models. Appl. Sci. 11(1), 329 (2020).
Sundaram, B. M., Rajalakshmi, B., Mandal, R. K., Nair, S. & Choudhary, S. S. Fall detection among elderly using deep learning. In 2023 International Conference on Intelligent and Innovative Technologies in Computing, Electrical and Electronics (IITCEE) (eds Sundaram, B. M. et al.) (IEEE, 2023).
Abdo, H., Amin, K. M. & Hamad, A. M. Fall detection based on RetinaNet and MobileNet convolutional neural networks. In 2020 15th International Conference on Computer Engineering and Systems (ICCES) (eds Abdo, H. et al.) (IEEE, 2020).
Liu, Y. H. et al. Automatic fall risk detection based on imbalanced data. IEEE Access 9, 163594–163611 (2021).
Ahamed, F., Shahrestani, S. & Cheung, H. Privacy-aware IoT Based fall detection with infrared sensors and deep learning. In International Conference on Interactive Collaborative Robotics (eds Ahamed, F. et al.) (Springer, 2023).
Alarifi, A. & Alwadain, A. Killer heuristic optimized convolution neural network-based fall detection with wearable IoT sensor devices. Measurement 167, 108258 (2021).
Rezaee, K., Khosravi, M. R., Neshat, N. & Moghimi, M. K. Deep transfer learning-based fall detection approach using IoMT-enabled thermal imaging-assisted pervasive surveillance and big health data. J. Circ. Syst. Comput. 31(12), 2240005 (2022).
El Zein, H., Mourad-Chehade, F. & Amoud, H. Leveraging Wi-Fi CSI data for fall detection: a deep learning approach. In 2023 5th International Conference on Bio-engineering for Smart Technologies (BioSMART) (eds El Zein, H. et al.) (IEEE, 2023).
Lu, N. et al. Deep learning for fall detection: Three-dimensional CNN combined with LSTM on video kinematic data. IEEE J. Biomed. Health. Inf. 23(1), 314–323 (2018).
Durga Bhavani, K. & Ferni Ukrit, M. Design of inception with deep convolutional neural network-based fall detection and classification model. Multimed. Tools Appl. 83(8), 23799–23817 (2024).
Şengül, G., Karakaya, M., Misra, S., Abayomi-Alli, O. O. & Damaševičius, R. Deep learning based fall detection using smartwatches for healthcare applications. Biomed. Sig. Proc. Control 71, 103242 (2022).
Kabir, M. M., Shin, J. & Mridha, M. F. Secure your steps: A class-based ensemble framework for real-time fall detection using deep neural networks. IEEE Access (2023).
Qu, Z., Huang, T., Ji, Y. & Li, Y. Physics sensor based deep learning fall detection system. Preprint at https://arxiv.org/abs/2403.06994 (2024).
Mohammad, Z., Anwary, A. R., Mridha, M. F., Shovon, M. S. H. & Vassallo, M. An enhanced ensemble deep neural network approach for elderly fall detection system based on wearable sensors. Sensors 23(10), 4774 (2023).
Jain, R. & Semwal, V. B. A novel feature extraction method for a pre-impact fall detection system using deep learning and wearable sensors. IEEE Sens. J. 22(23), 22943–22951 (2022).
Ong, J. H., Hayat, A. A., Gomez, B. F., Elara, M. R. & Wood, K. L. Deep learning based fall recognition and forecasting for reconfigurable stair-accessing service robots. Mathematics 12(9), 1312 (2024).
Alabdulkreem, E. et al. Chameleon swarm algorithm with improved fuzzy deep learning for fall detection approach to aid elderly people. J. Disabil. Res. 2(2), 62–70 (2023).
Kaur, A. et al. Cotton crop classification using satellite images with score level fusion based hybrid model. Pattern Anal. Appl. 27(2), 1–22 (2024).
Tolba, A. & Talal, N. Brain tumor classification using deep learning models under neutrosophic environment. Inf. Sci. Appl. 2, 77–91 (2024).
Jaiswal, A., Gianchandani, N., Singh, D., Kumar, V. & Kaur, M. Classification of the COVID-19 infected patients using DenseNet201 based deep transfer learning. J. Biomol. Struct. Dyn. 39(15), 5682–5689 (2021).
Yang, W., Yuan, Y., Zhang, D., Zheng, L. & Nie, F. An effective image classification method for plant diseases with improved channel attention mechanism aECAnet based on deep learning. Symmetry 16(4), 451 (2024).
Zhang, Y. & Li, H. Research on economic load dispatch problem of microgrid based on an improved pelican optimization algorithm. Biomimetics 9(5), 277 (2024).
Wang, J. et al. MDGN: Circuit design of memristor-based denoising autoencoder and gated recurrent unit network for lithium-ion battery state of charge estimation. IET Renew. Power Gener. 18(3), 372–383 (2024).
Auvinet, E., Rougier, C., Meunier, J., St-Arnaud, A. & Rousseau, J. Multiple cameras fall dataset. Tech. Rep. 1350, DIRO, Université de Montréal, Montreal, QC, Canada (2010).
UR Fall Detection (URFD) dataset with an overhead sequence. Available at http://fenix.univ.rzeszow.pl/~mkepski/ds/uf.html.
Acknowledgments
The authors extend their appreciation to the Deanship of Research and Graduate Studies at King Khalid University for funding this work through Large Research Project under grant number RGP2/32/45. Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2024R77), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors extend their appreciation to the Deanship of Scientific Research at Northern Border University, Arar, KSA for funding this research work through the project number “NBU-FFR-2024-2248-08”. This study is partially funded by the Future University in Egypt (FUE).
Author information
Authors and Affiliations
Contributions
Conceptualization: Wafa Sulaiman Almukadi. Data curation and formal analysis: Fadwa Alrowais, Radwa Marzouk. Investigation and methodology: Wafa Sulaiman Almukadi. Project administration, resources, and supervision: Abdulsamad Ebrahim Yahya, Radwa Marzouk. Validation and visualization: Muhammad Kashif Saeed, Ahmed Mahmud. Writing—original draft: Wafa Sulaiman Almukadi. Writing—review and editing: Abdulsamad Ebrahim Yahya. All authors have read and agreed to the published version of the manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Almukadi, W.S., Alrowais, F., Saeed, M.K. et al. Deep feature fusion with computer vision driven fall detection approach for enhanced assisted living safety. Sci Rep 14, 21537 (2024). https://doi.org/10.1038/s41598-024-71545-6