During the global pandemic of a novel coronavirus, many hospitals began running low on beds, ventilators, and staff. It is therefore essential to predict each patient's Length of Stay (LOS) in order to obtain a dynamic estimate of hospital capacity. In this work, different machine learning algorithms were deployed to predict patients' LOS across eleven classes. Each algorithm's performance was evaluated using the following metrics: accuracy, AUROC, sensitivity, and specificity. The highest prediction accuracy, 0.46, was obtained with the CatBoost algorithm; the corresponding sensitivity and specificity were 0.25 and 0.93, respectively. In an attempt to boost accuracy further, the number of classes was reduced from eleven to five, a number derived from the elbow diagram. Two methods for combining the eleven classes into five were introduced. Finally, the predictive performance of the different algorithms on the updated classes was investigated, and feature selection was applied to reduce complexity and improve the models' accuracy. Reducing the number of classes yielded a considerable improvement: the highest accuracy, again achieved by the CatBoost algorithm, rose to 0.68.
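To illustrate how accuracy, sensitivity, and specificity can diverge in a multi-class setting, the following minimal sketch computes them from a confusion matrix. It assumes a one-vs-rest treatment of each class and macro-averaging across classes; the source does not state which averaging scheme was used. The function name `macro_metrics` and the toy labels are hypothetical and are not taken from the study's data.

```python
from statistics import mean


def macro_metrics(y_true, y_pred, n_classes):
    """Return (accuracy, macro sensitivity, macro specificity).

    Assumption: each class is scored one-vs-rest and the per-class
    rates are averaged with equal weight (macro-averaging).
    """
    # confusion[i][j] counts samples of true class i predicted as class j
    confusion = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        confusion[t][p] += 1

    total = len(y_true)
    accuracy = sum(confusion[k][k] for k in range(n_classes)) / total

    sensitivities, specificities = [], []
    for k in range(n_classes):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp
        fp = sum(confusion[i][k] for i in range(n_classes)) - tp
        tn = total - tp - fn - fp
        if tp + fn:  # skip classes with no positive samples
            sensitivities.append(tp / (tp + fn))
        if tn + fp:  # skip classes with no negative samples
            specificities.append(tn / (tn + fp))

    return accuracy, mean(sensitivities), mean(specificities)


# Toy example with three hypothetical LOS classes (0, 1, 2)
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
acc, sens, spec = macro_metrics(y_true, y_pred, n_classes=3)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Under this one-vs-rest scheme, the true negatives for any single class include every sample from all the other classes, so specificity tends to stay high as the number of classes grows even when sensitivity is low. This is one plausible reading of the 0.25 and 0.93 pair reported for the eleven-class setting.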
1. Introduction and Related Research
The Covid-19 pandemic has raised the alarm over capacity shortages in many hospitals. Although some hospitals saw a sharp decline in acute care admissions during the Covid-19 era compared with the pre-Covid-19 era, many big-city hospitals have been overwhelmed by the novel coronavirus. This has drawn attention to the need to improve healthcare management. Among the tools used to improve the efficiency of this sector, data science has proven effective at detecting patterns, identifying behaviors through data analysis, and making predictions that support decision making. In fact, data science can be extended to a wide range of applications [1, 2].
Many studies have applied different classification methods for classifying the LOS. The authors of [3] used supervised classification techniques to predict mortality in adult sepsis patients. Their results showed that the best predictions were obtained by the SVM and the Artificial Neural Network (ANN) using physiological variables. The authors of [4] used machine learning techniques, specifically RF, NB, ANN, and LR, for the diagnosis of fatty liver disease, the accuracy of...