Abstract
In recent studies, artificial intelligence and machine learning methods have given higher accuracy than other prediction methods on large data sets with complex structures. Because mathematical models are difficult to construct for multi-parameter and multivariate problems, artificial intelligence and machine learning are used instead of statistical methods. In this study, length–weight relationships and meat yield were predicted with machine learning models using measurement data from male and female individuals of the narrow-clawed crayfish population living in Apolyont Lake. The data set was created from the growth performance and morphometric characters of 1416 crayfish sampled in different years to determine the length–weight relationship and length-meat yield. Seven different machine learning algorithms were applied to the data set, and the length–weight relationships and length-meat yields were evaluated for both male and female individuals. Support vector regression (SVR) achieved the best prediction performance, with R2 values of 0.996 and 0.992 for the length–weight relationship of males and females and 0.996 and 0.995 for the length-meat yield of males and females. SVR outperformed the other models in all scenarios with respect to the R2 and error metrics. The results indicate that most of the machine learning and artificial intelligence models are valid alternatives to traditional estimation methods in future planning studies for sustainable fisheries, aquaculture, and natural resource management.
Introduction
Crayfish, also known as freshwater lobsters, are among the largest inland decapod crustaceans, include economically important species, and are represented by 737 species and subspecies worldwide1,2. The species present in Türkiye is genetically identified as Pontastacus leptodactylus and has different subspecies2,3,4. Worldwide, crayfish are produced through catching and breeding. Despite the large number of species, catching and breeding activities generally focus on the species of only three economically important families (Cambaridae, Parastacidae, Astacidae). As of 2015, crayfish production by catching was 15,426 tons, excluding China, and Armenia ranked first with a production of 7380 tons. By species, the most produced crayfish by catching is Pontastacus leptodactylus. Through aquaculture, 787,373 tons of crayfish have been produced in the world. China ranks first with 723,200 tons of production. Procambarus clarkii is the leading species produced by aquaculture with 786,905 tons5. In recent years, crayfish plague, overfishing, and increased water pollution have caused fluctuations in crayfish production. Because of these fluctuating production volumes, stock management and alternative production methods should be improved and developed, and the available data should be used carefully and effectively. In other words, crayfish producers need to provide data that can be processed with new technologies, which could help decision-makers choose the right decisions and strategies for the future. However, very large amounts of data cannot be processed and analyzed manually. Machine learning offers many different ways of establishing patterns and relationships in data, and the kind of relationship sought determines the technique that can be used to make sense of the data.
The machine learning methods most commonly used in the literature are artificial neural networks6,7,8, logistic regression8,9, fuzzy modeling10, genetic algorithms and programming11, decision trees7,9, the Bayesian network approach12,13,14, random forest9,15, and support vector machines9,16. Regression-based modeling techniques such as generalized additive models (GAMs), generalized linear models (GLMs), classification and regression trees (CART), and multivariate adaptive regression splines (MARS) are widely used to estimate species distribution and water quality17. In addition, a modeling approach that integrates a functional network approach with a dynamic Bayesian network model has been used to predict trends of different fish and zooplankton species under specific fishing, temperature, and net primary production (Net PP) scenarios18. Similarly, Hamilton et al.13 used a Bayesian network approach to develop a habitat suitability model, and Lin et al.12 used a Bayesian analysis to account for the combined uncertainty and variability of parameters in a crayfish bioaccumulation model. Some growth parameters of Tigris loach (Oxynoemacheilus tigris), sampled between 2014 and 2015 at 14 different stations in Karasu, were estimated using both the length–weight relationship and an artificial neural network (ANN), and the two methods were compared. The measured and predicted data agreed closely, and the values obtained with the ANN were closer to the real values19. Ozcan20 also pointed out that ANN can be used as an alternative method for population estimation.
The study aims to reveal and predict the future status of the population by determining the meat yield and growth performance of narrow-clawed crayfish living in Apolyont Lake in different years, in support of sustainable aquaculture and fishing. One of its goals is to contribute to future research on Apolyont Lake, which is in an important market like Europe. In this context, the study consists of two modules. In the first, data preprocessing techniques are applied to the whole data set. In the second, the random forest regression (RFR), gradient boosting regression (GBR), decision tree regression (DTR), multilayer perceptron regression (MLPR), support vector regression (SVR), linear regression (LR), and K-nearest neighbors regression (K-NNR) machine learning methods are all run on the data set, and the method with the best estimation performance for the total length–weight relationship and meat yield of narrow-clawed crayfish is identified.
Materials and methods
Study area and data collection
The data set used in this study was created from narrow-clawed crayfish data obtained from Apolyont Lake in Balıkesir-Türkiye. The map of the study area was generated using ArcGIS Desktop version 10.8 (https://www.esri.com/en-us/arcgis/products/arcgis-desktop/overview) (Fig. 1). A total of 1416 narrow-clawed crayfish were caught, consisting of 573 (40%) females and 843 (60%) males. Each sample has 22 attributes, which are presented in Table 1. Length measurements of crayfish body parts are used to determine morphological differences between male and female crayfish among species15,16,18. These measurements are also used to determine the comparative growth of populations, the size of crayfish to be put on the market, meat yield, and systematic separation. Lengths of broodstock and juvenile narrow-clawed crayfish were measured with a digital caliper with 0.01 mm precision, taking the total length from the tip of the rostrum to the tip of the telson. Broodstock were weighed on a balance with 0.01 g precision, and juveniles on a balance with 0.0001 g precision21.
Length–weight relationship and meat yield
Regression analysis is generally used to determine the relationship between body length and weight of crustaceans22. As in fish, the relationship between length and weight in crayfish is nonlinear and takes the form of Eq. (1). Taking the logarithm of both sides of this equation makes the length–weight relationship linear, as in Eq. (2)22,23.
Abbreviations: L, total length (TL); W, total weight (TW); a and b, constant parameters of the equation.
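For reference, the allometric form conventionally used for length–weight relationships can be written with the symbols defined above. This is a sketch that assumes Eqs. (1) and (2) follow that standard form; the equation numbers are taken from the text.

```latex
\begin{align}
W &= a\,L^{b} \tag{1} \\
\log W &= \log a + b \log L \tag{2}
\end{align}
```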
The length–weight relationship of narrow-clawed crayfish was investigated in terms of the total length (TL)–total weight (TW) relationship. Accordingly, regression equations, curves, and correlation coefficients were calculated. While checking the significance of the calculated b value, the test statistic value was calculated using Eq. (3).
Abbreviations: Sx, standard deviation of log (L) values; Sy, standard deviation of log (W) values; n, the number of individuals used in the calculation; r2, coefficient of determination of the log (L) and log (W) values.
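The fitting and testing steps described above can be illustrated with a short Python sketch. It assumes the log-linear form log W = log a + b log L and assumes that the test statistic has the widely used form t = (Sx/Sy) · |b − 3| / sqrt(1 − r2) · sqrt(n − 2), which is consistent with the listed abbreviations but is not confirmed by the text. The function names and the sample data are hypothetical.

```python
import math

def fit_length_weight(lengths, weights):
    """Least-squares fit of log10(W) = log10(a) + b * log10(L)."""
    x = [math.log10(v) for v in lengths]
    y = [math.log10(v) for v in weights]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope (allometric exponent)
    a = 10 ** (my - b * mx)            # intercept back-transformed from log10
    r2 = sxy ** 2 / (sxx * syy)        # coefficient of determination
    sd_x = math.sqrt(sxx / (n - 1))    # Sx: standard deviation of log(L)
    sd_y = math.sqrt(syy / (n - 1))    # Sy: standard deviation of log(W)
    return a, b, r2, sd_x, sd_y, n

def t_statistic_for_b(b, r2, sd_x, sd_y, n, b0=3.0):
    """Assumed form of the test of b against the isometric value b0 = 3."""
    return (sd_x / sd_y) * abs(b - b0) / math.sqrt(1.0 - r2) * math.sqrt(n - 2)

# Hypothetical measurements: lengths in mm, weights in g with +/-5% scatter.
lengths = [50, 60, 70, 80, 90, 100, 110, 120]
weights = [2e-5 * L ** 3.1 * (1.05 if i % 2 == 0 else 0.95)
           for i, L in enumerate(lengths)]

a, b, r2, sd_x, sd_y, n = fit_length_weight(lengths, weights)
t = t_statistic_for_b(b, r2, sd_x, sd_y, n)
```

The resulting t value would be compared with the critical t value for n − 2 degrees of freedom to judge whether b differs significantly from 3.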
To determine the meat yield, the abdomen and claws were cut open with a scalpel, and the meat inside was weighed directly24.
Machine learning techniques
Machine learning is the general name for computer algorithms that learn the solution to a problem through complex pattern detection and data-based decision making25. Linear regression is one of the most widely used machine learning algorithms. It is a modeling method that aims to establish a linear relationship between one or more independent variables and a dependent variable with a numerical value. The method therefore models the relationship between the dependent and independent variables by analyzing and learning from the training data26. Classification assigns the items in a data set to classes according to their attributes. Classification algorithms analyze the relationships between class labels and the other features in a given training set27. The success of the resulting model is determined by using it to decide which class a new item belongs to and testing that decision. In this study, the growth of the crayfish population in Apolyont Lake was modeled and predicted using the popular machine learning algorithms random forest regression (RFR), gradient boosting regression (GBR), decision tree regression (DTR), multilayer perceptron regression (MLPR), support vector regression (SVR), linear regression (LR), and K-nearest neighbors regression (K-NNR), and their results were compared to evaluate their performance.
RFR is a collection of decision trees, each built independently of the others from a random sample of the training data drawn from the same distribution. The method creates many decision trees during training. At estimation time, the outputs of these trees are combined: in classification, the class of the input is decided by majority vote, and in regression, the tree outputs are averaged. RF regression is distinctive in that it uses multiple decision trees to produce better-fitting models and more accurate predictions. As a result, it produces the same prediction for all inputs that fall within a given range9,27,28.
GBR is a machine learning algorithm developed by Friedman29 to improve the predictions of decision trees. In the first iteration, an initial prediction function is created; the prediction functions are called trees. The differences between the estimates and the observations are calculated, and a loss function is obtained from these differences. When the next tree is created, the errors of the previously created trees are taken into account: in the second iteration, the new tree is fitted to the differences between the current predictions and the observations and is then added to the prediction function. By adding trees repeatedly in this way, the prediction function improves and the error approaches zero.
DTR observes the features of an object and trains a tree-structured model to predict future data and produce a meaningful continuous output. Continuous output means that the result is not discrete, i.e., it is not restricted to a known, discrete set of numbers or values27,30.
MLPR is a neural network with one or more hidden layers between the input layer and the output layer. It can model nonlinear data through several hidden layers and nonlinear activation functions such as ReLU and tanh.
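The iterative residual-fitting idea described for GBR can be sketched in a few lines of Python. This is a deliberately simplified illustration, assuming squared-error loss, a single input feature, and one-split trees (stumps) as the base learners; it is not the configuration used in the study, and all function names and the `learning_rate` parameter are hypothetical.

```python
def fit_stump(x, residuals):
    """Best one-split regression stump (least squares) on a single feature.
    Assumes x contains at least two distinct values."""
    best = None
    order = sorted(set(x))
    for lo, hi in zip(order, order[1:]):
        thr = (lo + hi) / 2.0
        left = [r for xi, r in zip(x, residuals) if xi <= thr]
        right = [r for xi, r in zip(x, residuals) if xi > thr]
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, thr, lmean, rmean)
    _, thr, lmean, rmean = best
    return thr, lmean, rmean

def stump_predict(stump, xi):
    thr, lmean, rmean = stump
    return lmean if xi <= thr else rmean

def gradient_boost(x, y, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # Residuals are the differences between observations and predictions.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        # Add a damped copy of the new stump to the running prediction.
        pred = [pi + learning_rate * stump_predict(stump, xi)
                for xi, pi in zip(x, pred)]
    return base, learning_rate, stumps

def boosted_predict(model, xi):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, xi) for s in stumps)
```

With squared-error loss, each round can only lower or preserve the training error, which is the sense in which the error is driven toward zero as trees are added.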
In particular, regression analysis using MLPR does not require the assumption of a statistical relationship between independent and dependent variables. For this reason, MLPR is widely used as a regression algorithm in various fields31,32.
SVR is a support vector machine (SVM) implementation that generates a real number as output. SVM can be applied to regression problems by introducing an alternative loss function. SVR is built on the principle of structural risk minimization to solve complex problems33,34,35.
LR is a supervised learning algorithm that uses a parametric model and a linear approach for a prediction problem9,27,36.
K-NN is a well-known non-parametric approach used for both classification and regression. In both cases, the input consists of the k closest training examples in the feature space. In K-NNR, the output is the property value of the object, computed as the average of the values of its k nearest neighbors36,37.
The hyperparameters that make these machine learning methods flexible were tested and optimized.
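The K-NNR prediction rule described above (the average of the values of the k nearest neighbors) can be sketched as follows. This is a minimal illustration that assumes Euclidean distance and unweighted averaging; the function name and the value of `k` are hypothetical and do not reflect the tuned settings of the study.

```python
def knn_regress(train_x, train_y, query, k=3):
    """Predict a value as the mean target of the k nearest training points.

    train_x: list of feature tuples, train_y: list of target values,
    query: feature tuple of the point to predict.
    """
    # Pair each Euclidean distance to the query with the target value.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(xi, query)) ** 0.5, yi)
        for xi, yi in zip(train_x, train_y)
    )
    nearest = dists[:k]
    return sum(yi for _, yi in nearest) / len(nearest)
```

For example, with training lengths (1.0,), (2.0,), (3.0,), (10.0,) and targets equal to the lengths, a query at (2.1,) with k = 3 averages the targets of the three closest points.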
Data preprocessing improves the quality of the raw data used in a study and is one of the most important steps, with a direct positive effect on the performance of all algorithms. Because machine learning algorithms are generally data-driven, operations such as cleaning, scaling, reduction, and normalization have significant effects on prediction accuracy38,39. In this study, after the raw data were organized, they were normalized with min–max normalization. The normalization formula applied is given in Eq. (4).
In this formula, each input value (xi) is linearly rescaled to a normalized value (xn) between 0 and 1 using the minimum (xmin) and maximum (xmax) values of the raw data set27,39. Of the available data, 70% was used to train the machine learning models and 30% to test them, because a 70–30 split trains the model adequately on various data sets and provides better generalization and performance evaluation40,41. The flow chart of the proposed method is given in Fig. 2.
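The two preprocessing steps described above can be sketched in plain Python. The normalization follows the usual min–max form xn = (xi − xmin) / (xmax − xmin), which matches the symbols named in the text; the shuffling, the fixed seed, and the function names are assumptions made for illustration only.

```python
import random

def min_max_normalize(values):
    """Linearly rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: map everything to 0 to avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def train_test_split_70_30(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of the rows and split it into training and test parts."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]
```

Normalizing with the minimum and maximum of the whole raw data set, as the text describes, lets the test rows influence the scaling; computing the two values on the training part only is a common alternative.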
The models were built on a desktop computer running Windows 10 Pro with the following hardware configuration: Intel(R) Core(TM) i7-7700HQ processor at 2.80 GHz and 16.0 GB of RAM. The Python language was used in the present study.
Performance metrics
R-squared (coefficient of determination) (R2), root mean square error (RMSE), mean absolute error (MAE), relative absolute error (RAE), and root relative square error (RSE) were used to measure the estimation performance of the models. These metrics make it possible to decide which technique is most suitable for this data set. R2 [Eq. (5)], also known as the coefficient of determination, represents the prediction performance of regression models; in linear regression, it measures how close the data points are to the fitted line. It takes values between 0 and 1, and the closer it is to 1, the better the model. RMSE [Eq. (6)] measures how far the data points are from the regression line, i.e., the size of the prediction errors. MAE [Eq. (7)] is the mean of the absolute differences between the predicted and true values. RAE [Eq. (8)] takes the total absolute error and normalizes it by dividing it by the total absolute error of the simple estimator. RSE [Eq. (9)] is the square root of the ratio of the sum of squared differences between the estimated and true values to the sum of squared differences between the true values and their mean42. The lower the four error metrics and the closer the coefficient of determination is to 1, the more accurate the results. None of these five metrics is sufficient on its own; they are meaningful when evaluated together.
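The five metrics can be computed as in the sketch below. It assumes the common textbook definitions, with the mean of the true values as the simple estimator and R2 computed as 1 minus the ratio of residual to total sum of squares; the exact forms of Eqs. (5) to (9) may differ in detail, and the function name is hypothetical.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE, RAE and RSE for paired true/predicted values."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    sq_err = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    # Errors of the simple estimator that always predicts the mean.
    abs_dev = sum(abs(t - mean_y) for t in y_true)
    sq_dev = sum((t - mean_y) ** 2 for t in y_true)
    return {
        "R2": 1.0 - sum(sq_err) / sq_dev,
        "RMSE": math.sqrt(sum(sq_err) / n),
        "MAE": sum(abs_err) / n,
        "RAE": sum(abs_err) / abs_dev,
        "RSE": math.sqrt(sum(sq_err) / sq_dev),
    }
```

Under these definitions, RAE and RSE fall below 1 whenever the model does better than always predicting the mean, which is why they are read alongside R2 rather than in isolation.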
Results
Length–weight relationships and meat yields are used to compare the characteristics of different crayfish populations. In this study, the weights of the crayfish were related to their lengths separately by sex. Carapace length ranged between 23 and 71 mm: from 23 to 70 mm in females and from 28 to 71 mm in males, with an average of 44 mm across all individuals. Live weight ranged between 2.5 and 92.4 g: from 2.5 to 72.4 g in females and from 2.5 to 92.4 g in males. The average total weight was 20.0 g in females and 21.9 g in males. Although meat amount was linearly related to carapace length in both male and female crayfish, the relationship was stronger in males. Total meat yield averaged 16.45% across the 1416 individuals examined. Carapace length–total weight (CL-TW) relationship graphs were drawn for females, males, and the entire mixed population (Fig. 3).
Seven different machine learning algorithms were run on the training data. The performance metric results for predicting total weight from length measurements are given in Table 2 for males and in Table 3 for females.
Likewise, the performance metric results for predicting abdominal meat yield from length measurements are given in Tables 4 and 5 for males and females, respectively.
The best length–weight and length-meat yield performance metric results for all individuals were obtained with SVR. With SVR, the accuracy levels were 99.6% for the male length–weight prediction (R2 = 0.996; RMSE = 0.979), 99.2% for the female length–weight prediction (R2 = 0.992; RMSE = 0.966), 99.6% for the male length-meat yield prediction (R2 = 0.996; RMSE = 0.098), and 99.5% for the female length-meat yield prediction (R2 = 0.995; RMSE = 0.977). The lowest accuracy levels for the female length–weight and for the male and female length-meat yield results were obtained with MLPR. For the male length–weight results, the lowest accuracy level was obtained with DTR.
Discussion
Knowing the current status of a population is important for determining fishing strategies in fisheries management. The basic idea is to estimate population size at regular intervals. This is tiring work that requires money and effort; nevertheless, changes in the population can be observed by doing it at least every few years, because over a given period a population changes in number and weight under the effect of factors such as death, reproduction, migration, growth, catching, abundance of food, and predators43,44. Stock monitoring studies, which need to be carried out without interruption for the sustainability of populations, and the determination of growth characteristics among the population parameters are important for evaluating the population. Length differences between body parts are used to show morphological changes between male and female individuals of crayfish species. These differences are also used to determine the relative growth of crayfish populations and to compare populations of the same species. The meat yield of individuals in the population is an important parameter used in population estimation under extensive conditions. The exact determination of the population size is possible only by catching all individuals. Since this is impossible in practice, the population size is determined with the help of the population parameters explained here. Therefore, this study aimed to determine these growth characteristics of crayfish.
Machine learning provides a neutral approach to recognizing unknown interactions and deriving predictions, with the potential to aid meaningful feature selection. The correlation values produced from the algorithm estimates describe the correlation between the measured and the predicted length–weight and length-meat yield values. Because this study concerns the correct prediction of length–weight and length-meat yield values, the algorithms are evaluated on the positive correlation values that result from their estimates: the closer a correlation value is to +1, the closer the length–weight and length-meat yield estimates are to the measurements.
In the present study, SVR gave R2 values of 0.996 and 0.992 for the length–weight relationship of male and female individuals and 0.996 and 0.995 for the length-meat yield of males and females, which are the closest to the measured values. SVR stands out as the method with the best R2 and error values. SVR can be both linear and non-linear35; owing to this feature, SVR is considered to perform better in evaluating the population structure. Similarly, Benzer et al.8 and Benzer and Benzer45 showed that ANN could be a superior estimation tool compared to the length–weight relationship for the growth predictions of narrow-clawed crayfish in Hirfanlı Dam Lake and Uluabat Lake, respectively. In addition, SVM provided 80% accuracy in classifying sea bream (Sparus aurata) feeding trials according to hematological blood parameters9, but K-NN gave better results than SVM in Alburnus tarichi population analysis35. All these studies show that artificial intelligence and machine learning applications such as ANN should be chosen specifically for the data at hand. Before evaluating a population in the long term, the first thing to do is to determine which application is appropriate for that population.
The results of this study showed that SVR is the most successful application to evaluate the crayfish population living in Apolyont Lake. To better evaluate the results, the prediction and actual values were given for male length–weight in Fig. 4, for female length–weight in Fig. 5, for male length-meat yield in Fig. 6, and for female length-meat yield in Fig. 7.
Ecological factors determining the presence of white-clawed crayfish (Austropotamobius pallipes) have been evaluated using SVM, a machine learning method. The models without feature selection, the models created by applying Goldberg’s genetic algorithm after feature selection, and the models built after selecting inputs using the four supervised-filter evaluators had classification accuracies of 70.84%, 73.92%, and 76.62%, respectively16. The model applied in this study has higher accuracy and gives values closer to the real values. Thus, the study results of Zelaya38 were supported and the accuracy of the results obtained from this study was observed. Besides, Tirelli et al.7 used logistic regression, decision tree models, and artificial neural networks to manage data on the presence/absence of native white-clawed crayfish. They obtained better performance from logistic regression and decision tree models using the artificial neural network model. However, a hybrid three-dimensional (3D) dissolved oxygen content estimation model based on a radial basis function neural network, K-means, and subtractive clustering effectively demonstrated the three-dimensional distribution in predicting changes in dissolved oxygen content in crab ponds46. Many crayfish cannot be sold at the end of the sales period because of errors in production planning, growth estimation, transportation, grading, and the choice of stock density. This causes a loss of resources, energy, and capital in both the production and sales stages. The present study showed that, given the multivariate nature of predicting the growth rate of crayfish populations, the prediction can be made with machine learning methods, which give faster and more accurate results on complex structured data sets, instead of creating hybrid models.
Conclusion
This is the first study to use machine learning methods to predict the total length-total weight and total length-meat yield performance of the narrow-clawed crayfish population of Apolyont Lake. The results show that machine learning methods can predict population growth rates and size in fisheries and aquatic populations. Among these seven machine learning applications implemented, SVR gave the best performance in terms of both R2 value and error metrics. In the study, it was seen that the values obtained by machine learning have high performance and are closer to the real values. The experimental results of the study show that the proposed method can be used as an effective population measurement estimation tool. Moreover, this article, which presents a series of empirical analyses with the results discussed, is considered to be valuable for professionals of natural resource management, fisheries, aquaculture, and their sustainability.
Data availability
All data that support the findings of this study are included within this paper and its Supplementary Information files.
References
Crandall, K. A. & Buhay, J. E. Global diversity of crayfish (Astacidae, Cambaridae, and Parastacidae-Decapoda) in freshwater. Hydrobiologia 595(1), 295–301. https://doi.org/10.1007/s10750-007-9120-3 (2008).
Crandall, K. A. & De Grave, S. An updated classification of the freshwater crayfishes (Decapoda: Astacidea) of the world, with a complete species list. JCB 37(5), 615–653. https://doi.org/10.1093/jcbiol/rux070 (2017).
Akhan, S., Bektas, Y., Berber, S. & Kalayci, G. Population structure and genetic analysis of narrow-clawed crayfish (Astacus leptodactylus) populations in Turkey. Genetica 142, 381–395. https://doi.org/10.1007/s10709-014-9782-5 (2014).
Berber, S., Kale, S., Bulut, M. & İzci, B. A study on determining the ideal stock density of freshwater crayfish (Pontastacus leptodactylus) in polyculture with rice (Oryza sativa L.). KSU J. Agric. Nat. 22(6), 953–964 (2019).
FAO. Fishery and Aquaculture Statistics. Global capture production 1950–2015 (FishStatJ). In: FAO Fisheries and Aquaculture Department. https://ww.fao.org/fishery/statistics/software/fishstatj/en. (2017).
Suryanarayana, I. et al. Neural networks in fisheries research. Fish. Res. 92(2–3), 115–139. https://doi.org/10.1016/j.fishres.2008.01.012 (2008).
Tirelli, T., Favaro, L., Gamba, M. & Pessani, D. Performance comparison among multivariate and data mining approaches to model presence/absence of Austropotamobius pallipes complex in Piedmont (North Western Italy). C. R. Biol. 334(10), 695–704. https://doi.org/10.1016/j.crvi.2011.07.002 (2011).
Benzer, S., Benzer, R. & Günal, A. Ç. Artificial neural networks approach in morphometric analysis of crayfish (Astacus leptodactylus) in Hirfanlı Dam Lake. Biologia 72(5), 527–535. https://doi.org/10.1515/biolog-2017-0052 (2017).
Gültepe, Y. & Gültepe, N. Preliminary study for the evaluation of the hematological blood parameters of seabream with machine learning classification methods. IJA 72, 1–10 (2020).
Zuther, S., Schulz, H. K., Lentzen-Godding, A. & Schulz, R. Development of a habitat suitability index for the noble crayfish Astacus astacus using fuzzy modelling. Bull. Fr. Peche Piscic. 376–377, 731–742 (2005).
Luna, M., Lorente, I. & Cobo, A. Determination of feeding strategies in aquaculture farms using a multiple-criteria approach and genetic algorithms. Ann. Oper. Res. https://doi.org/10.1007/s10479-019-03227-w (2019).
Lin, H. et al. A Bayesian approach to parameter estimation for a crayfish (Procambarus spp.) bioaccumulation model. Environ. Toxicol. Chem. 23(9), 2259–2266. https://doi.org/10.1897/03-303 (2004).
Hamilton, S. H., Pollino, C. A. & Jakeman, A. J. Habitat suitability modelling of rare species using Bayesian networks: Model evaluation under limited data. Ecol. Model. 299, 64–78. https://doi.org/10.1016/j.ecolmodel.2014.12.004 (2015).
Trifonova, N., Maxwell, D., Pinnegar, J., Kenny, A. & Tucker, A. Predicting ecosystem response to changes in fisheries catch, temperature and primary productivity with a dynamic Bayesian network model. IJMS 74(5), 1334–1343 (2017).
Adibi, P. et al. Predicting fishing effort and catch using semantic trajectories and machine learning. In Multiple-Aspect Analysis of Semantic Trajectories: First International Workshop, MASTER 2019, Held in Conjunction with ECML-PKDD 2019, Würzburg, Germany, September 16, 2019, Proceedings (ed. Kani, B.) 83–99 (Springer International Publishing, 2020).
Favaro, L., Tirelli, T. & Pessani, D. Modelling habitat requirements of white-clawed crayfish (Austropotamobius pallipes). Knowl. Manag. Aquat. Ecosyst. 401, 21. https://doi.org/10.1051/kmae/2011037 (2011).
Nosair, A. M. et al. Predictive model for progressive salinization in a coastal aquifer using artificial intelligence and hydrogeochemical techniques: A case study of the Nile Delta aquifer, Egypt. ESPR 29, 9318–9340. https://doi.org/10.1007/s11356-021-16289-w (2022).
Leathwick, J. R., Elith, J. & Hastie, T. Comparative performance of generalized additive models and multivariate adaptive regression splines for statistical modelling of species distributions. Ecol. Model. 199(2), 188–196. https://doi.org/10.1016/j.ecolmodel.2006.05.022 (2006).
Ozcan, E. I. & Serdar, O. Artificial neural networks as new alternative method to estimating some population parameters of Tigris loach (Oxynoemacheilus tigris (Heckel, 1843)) in the Karasu River, Turkey. Fresenius Environ. Bull. 27(12B), 9840–9850 (2018).
Ozcan, E. I. Artificial neural networks (a new statistical approach) method in length-weight relationships of Alburnus mossulensis in Murat River (Palu-Elazığ) Turkey. Appl. Ecol. Environ. Res. 17(5), 10253–10266 (2019).
Berber, S. & Kale, S. Comparison of juvenile Astacus leptodactylus growth raised in cages in rice fields to other crayfish juvenile growth studies. TrJFAS 18(2), 331–341. https://doi.org/10.4194/1303-2712-v18_2_12 (2018).
Ricker, W. E. Linear regressions in fishery research. J. Fish. Res. Board Can. 30(3), 409–434. https://doi.org/10.1139/f73-072 (1973).
Sedik, Y., Rumahlatu, D., Irawan, B. & Soegianto, A. Morphometric characteristics of crayfish, Cherax gherardiae, from Maybrat, West Papua, Indonesia. Fish. Aquat. Life 26(4), 223–230. https://doi.org/10.2478/aopf-2018-0025 (2019).
Berber, S. & Balık, S. The lenght-weight relationships, and meat yield of crayfish (Astacus leptodactylus Eschcholtz, 1823) population in Apolyont Lake (Bursa, Turkey). J. Fish. Sci. 3(2), 86–99. https://doi.org/10.3153/jfscom.2009012 (2009).
Gültepe, Y. A comparative assessment on air pollution estimation by machine learning algorithms. EJOSAT 16, 8–15 (2019).
Gültepe, Y. Lung cancer prediction based on performance using different classification algorithms. CMC Comput. Mater. Con. 67(2), 2015–2028 (2021).
Maulud, D. H. & Abdulazeez, A. M. A review on linear regression comprehensive in machine learning. JASTT 1(4), 140–147 (2020).
Breiman, L. Random forests. Mach. Learn. 45, 5–32. https://doi.org/10.1023/A:1010933404324 (2001).
Friedman, J. H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 29(5), 1189–1232. https://doi.org/10.1214/aos/1013203451 (2001).
Syamala, K. & Rajeshwari, I. Enhanced gradient boosting regression tree for crop yield prediction. Int. J. Sci. Res. 9(3), 1651–1654 (2020).
Reilly, R. G. Learning in artificial neural network. In Encyclopedia of the Sciences of Learning (ed. Seel, N. M.) (Springer, 2012).
Gültepe, Y. & Duru, A. M. Daily SO2 air pollution prediction with the use of artificial neural network models. IJCA 181(34), 36–40. https://doi.org/10.5120/ijca2018918271 (2018).
Ahmad, U. et al. Rethinking the artificial neural networks: A mesh of subnets with a central mechanism for storing and predicting the data. IEEE Trans. Neural Netw. Learn. Syst. https://arxiv.org/abs/1901.01462 (2019).
Abdel-Sattar, M., Aboukarima, A. M. & Alnahdi, B. M. Application of artificial neural network and support vector regression in predicting mass of ber fruits (Ziziphus mauritiana Lamk.) based on fruit axial dimensions. PONE 16(1), 1–15 (2021).
Gültepe, Y. Analysis of Alburnus tarichi population by machine learning classification methods for sustainable fisheries. SLAS Tech. 27(4), 261–266. https://doi.org/10.1016/j.slast.2022.03.005 (2022).
Seber, G. A. & Lee, A. J. Linear Regression Analysis 2nd edn. (Wiley, 2003).
Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction 2nd edn, 8–9 (MIT Press, 2018).
Zelaya, C. V. G. Towards explaining the effects of data preprocessing on machine learning. In IEEE 35th International Conference on Data Engineering (ICDE) (ed. Kani, B.) 2086–2090 (IEEE, 2019).
Rafique, R., Islam, S. M. R. & Kazi, J. U. Machine learning in the prediction of cancer therapy. CSBJ 19, 4003–4017. https://doi.org/10.1016/j.csbj.2021.07.003 (2021).
de Bruin, G. J., Veenman, C. J., van den Herik, H. J. & Takes, F. W. Experimental evaluation of train and test split strategies in link prediction. In Complex Networks & Their Applications IX: Proceedings of the Ninth International Conference on Complex Networks and Their Applications (COMPLEX NETWORKS 2020) 79–91 (Springer International Publishing, 2020).
Khan, A. Balanced split: A new train-test data splitting strategy for imbalanced datasets. Preprint at https://arxiv.org/abs/2212.11116 (2022).
Graczyk, M., Lasota, T. & Trawiński, B. Comparative analysis of premises valuation models using KEEL, RapidMiner, and WEKA. In Computational Collective Intelligence. Semantic Web, Social Networks and Multiagent Systems (eds Nguyen, N. T. et al.) (Springer, 2009).
Gulland, J. A. Fish stock assessment: A manual of basic methods xii, 223 pp. John Wiley & Sons (FAO/Wiley series of food and agriculture, Vol. 1.). JMBA 64(1), 249–249. https://doi.org/10.1017/S0025315400059786 (1984).
Cadima, E. L. Fish Stock Assessment Manual 161 (FAO Fisheries Technical Paper, 2003).
Benzer, S. & Benzer, R. New perspectives for predicting growth properties of crayfish (Astacus leptodactylus Eschscholtz, 1823) in Uluabat Lake. Pak. J. Zool. 50(1), 35–45 (2018).
Chen, Y., Yu, H., Cheng, Y., Cheng, Q. & Li, D. A hybrid intelligent method for three-dimensional short-term prediction of dissolved oxygen content in aquaculture. PONE 13(2), e0192456. https://doi.org/10.1371/journal.pone.0192456 (2018).
Author information
Contributions
Y.G. applied machine learning techniques and checked the manuscript. S.B. collected data from the lake and checked the manuscript. N.G. analyzed data and wrote the main manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Gültepe, Y., Berber, S. & Gültepe, N. Modeling and predicting meat yield and growth performance using morphological features of narrow-clawed crayfish with machine learning techniques. Sci Rep 14, 18499 (2024). https://doi.org/10.1038/s41598-024-69539-5
DOI: https://doi.org/10.1038/s41598-024-69539-5