eXtreme Gradient Boosting-based method to classify patients with COVID-19
  1. Antonio Ramón1,
  2. Ana Maria Torres2,
  3. Javier Milara1,3,
  4. Joaquín Cascón2,
  5. Pilar Blasco1,
  6. Jorge Mateo2
  1. 1 Pharmacy Department, General University Hospital Consortium of Valencia, Valencia, Spain
  2. 2 Institute of Technology, Universidad de Castilla-La Mancha, Cuenca, Spain
  3. 3 Pharmacy Department, University of Valencia, Valencia, Spain
  1. Correspondence to Dr Jorge Mateo, Institute of Technology, Universidad de Castilla-La Mancha, Cuenca, Castilla-La Mancha, Spain; jomateo2010{at}


Abstract

Different demographic, clinical and laboratory variables have been related to severity and mortality following SARS-CoV-2 infection. Most studies applied traditional statistical methods, in some cases combined with a machine learning (ML) method. This is the first study to date to comparatively analyze five ML methods to select the one that most accurately predicts mortality in patients admitted with COVID-19. The aim of this single-center observational study is to classify, based on different types of variables, adult patients with COVID-19 at increased risk of mortality. SARS-CoV-2 infection was defined by a positive reverse transcriptase PCR. A total of 203 patients were admitted between March 15 and June 15, 2020 to a tertiary hospital. Data were extracted from the electronic medical record. Four supervised ML algorithms (k-nearest neighbors (KNN), decision tree (DT), Gaussian naïve Bayes (GNB) and support vector machine (SVM)) were compared with the proposed eXtreme Gradient Boosting (XGB) method, which offers excellent scalability and high running speed, among other qualities. The results indicate that the XGB method has the best prediction accuracy (92%), high precision (>0.92) and high recall (>0.92). The KNN, SVM and DT approaches present moderate prediction accuracy (>80%), moderate recall (>0.80) and moderate precision (>0.80). The GNB algorithm shows relatively low classification performance. The variables with the greatest weight in predicting mortality were C reactive protein, procalcitonin, glutamic oxaloacetic transaminase, glutamic pyruvic transaminase, neutrophils, D-dimer, creatinine, lactic acid, ferritin, days of non-invasive ventilation, septic shock and age. Based on these results, XGB is a solid candidate for correct classification of patients with COVID-19.

  • COVID-19

Data availability statement

Data are available upon reasonable request.

This article is made freely available for personal use in accordance with BMJ’s website terms and conditions for the duration of the covid-19 pandemic or until otherwise determined by BMJ. You may use, download and print the article for any lawful, non-commercial purpose (including text and data mining) provided that all copyright notices and trade marks are retained.


  • COVID-19 causes severe acute respiratory syndrome manifesting clinically from asymptomatic to mild forms with cough, fever and myalgia, to triggering bilateral pneumonia with severe respiratory failure and multiorgan damage, which can lead to death.

  • A wide variety of clinical, laboratory and demographic variables associated with severity and mortality from COVID-19 have been identified, including but not limited to age, previous health status and laboratory parameters.

  • Most studies do not perform a comprehensive risk assessment to predict COVID-19-related mortality because the large number of clinical, laboratory and anthropometric variables limits the conclusions that can be drawn.


  • Machine learning, as part of artificial intelligence, is a useful tool to assign variables to predict COVID-19 mortality.

  • The eXtreme Gradient Boosting (XGB) machine learning model was superior to decision tree, Gaussian naïve Bayes, k-nearest neighbors and support vector machine models in predicting COVID-19 mortality.

  • The variables that best predicted COVID-19 mortality were levels of C reactive protein, procalcitonin, glutamate oxaloacetate transaminase and glutamate pyruvate transaminase, number of neutrophils, D-dimer, creatinine, septic shock and age.


  • The present work indicates that machine learning is a useful tool to predict mortality in hospitalized patients.

  • Among the machine learning procedures compared, XGB was the best at predicting mortality and could be used routinely to identify patients at increased risk of worsening.

  • This work identifies laboratory parameters that better predict mortality and could potentially be used to stratify patients at risk.


Introduction

COVID-19, caused by infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), first emerged in Wuhan, Hubei, China in December 2019.1 The virus is highly transmissible, even more so than SARS-CoV,2 and manifests clinically in forms ranging from asymptomatic or mild, with cough, fever and myalgia, to bilateral pneumonia with severe respiratory failure that requires mechanical ventilation and/or multiorgan damage that can lead to death.3 During the first wave, the mortality rate due to COVID-19 was less than 3%, although the fatality rate for severe cases is high, according to the WHO. The current global epidemiological situation is characterized by a high percentage of the population immunized against SARS-CoV-2, as well as an increase in the proportion of mild and asymptomatic cases, with the case fatality rate being less than 1%.4 In Spain, as of February 11, 2022, 10,555,197 cases of COVID-19 have been confirmed, including a total of 95,606 deaths.4 Case fatality rates help to understand the severity of the disease, identify populations at risk and assess the quality of healthcare. Predicting the clinical course of this disease based on several variables is of vital importance for proper patient management.

A wide variety of clinical, laboratory and demographic variables associated with severity and mortality from COVID-19 have been identified.5 6 Most studies did not perform comprehensive risk assessment to predict COVID-19-related mortality.7 8 To circumvent these drawbacks, machine learning (ML) models have emerged that are designed to make accurate predictions using data from a multitude of variables, as opposed to classic statistical models created to make inferences about relationships between variables. ML, as part of artificial intelligence (AI), uses statistical and mathematical algorithms that allow the identification of patterns that help in making complex decisions.9 These algorithms can be used to develop predictive models and reduce the complexity of clinical phenotypes. They are used in biomedicine as elements of clinical decision support and as generators of new clinical knowledge. For example, they have been used in the prediction of hospitalization for heart disease.10

On the other hand, progress has been made in the modeling of clinical data in electronic medical records (EMR) and specifically in the ability of ML techniques to predict mortality.11 ML constitutes an integrative method that allows observation of the combined effect of multiple variables and their interactions, allowing generation of knowledge about the disease from patients’ EMR data, and is a very useful tool in conditions where structured numerical data are readily available.

ML algorithms have been explored in different fields of COVID-19, mainly in the detection of outbreaks and spread of SARS-CoV-2,12 prediction of incidence rates,13 early diagnosis,14 prediction of risk of complications and severity,15 as well as prediction of mortality risk.16–20 Syeda et al 21 recently conducted a systematic review on the role of AI as a comprehensive and critical technology in combating the COVID-19 crisis in the fields of epidemiology, diagnosis and disease progression. Only 14.6% of the studies were related to the latter. Thus, a more precise approach to COVID-19 mortality is needed. To our knowledge, this is the first study to develop, compare and validate five ML models in predicting in-hospital mortality in patients admitted with COVID-19 in a tertiary-level hospital and research reference hospital in Spain during the first wave of the pandemic. Demographic, clinical and laboratory data easily extractable from the hospital EMR were used for prediction. The study is structured in a brief introduction highlighting the case fatality or mortality rate as a key variable in the study of a newly emerging disease such as COVID-19 and the importance of applying ML to numerous variables associated with hospital mortality. The Materials and methods section describes the types of variables included and the data collected and used for the different ML models applied. The Results section includes, among others, the accuracy values of the five validated algorithms for predicting hospital mortality from COVID-19. The Discussion section compares the results obtained in this study with the results of other studies using ML. Finally, the Conclusion section highlights the eXtreme Gradient Boosting (XGB) method over the other ML methods as a method for predicting mortality, facilitating patient stratification and optimizing medical resources.

Materials and methods

Data sources

Patient data were obtained from different internal sources of the hospital, such as the EMR (Hosix. Net. Ink.), which includes a module for registration of results of clinical analysis and a module for electronic prescription of drugs and the prescription program of the intensive care unit (ICU) (IntelliSpace Critical Care and Anesthesia, V.H.02.00, Philips Iberica). With this information, a data collection questionnaire (DCQ) was constructed individually by patient.

Study design and population

This is a retrospective observational study carried out in a tertiary-level hospital that attends a monthly average of 12,000 emergencies and 2,000 hospital admissions. A total of 203 patients admitted to the hospital with SARS-CoV-2 were included. The inclusion criterion was admission to the Valencia University General Hospital Consortium with SARS-CoV-2 infection confirmed microbiologically by reverse transcriptase PCR assay of a nasopharyngeal swab between March 15 and June 15, 2020. The patients selected were admitted to the hospital for a period of ≥7 days. Exclusion criteria were age ≤18 years and missing data for more than one clinical/laboratory variable during this period. Participants gave informed consent before taking part in the study.

Study data

Data on demographic, clinical and laboratory variables were included in the DCQ. The questionnaire was divided into eight sections.

Patient characteristics

Demographic variables such as age and sex and the following clinical variables were included: weight, height and presence of comorbidities of interest (hypertension, diabetes mellitus, chronic obstructive pulmonary disease, asthma, other chronic respiratory disease (eg, pulmonary dysplasia, cystic fibrosis), use of oxygen therapy or presence of tracheostomy, heart failure, ischemic heart disease, pulmonary hypertension, recent catheterization, renal failure (RF), cirrhosis, history of neurological, active haematological or oncological neoplasia (with active treatment, diagnosis or recurrence/metastasis <5 years, excluding diagnosis of squamous cell and basal cell carcinoma), and HIV). In the event that the patient presented another type of serious underlying pathology, it was specified in an open-text section. The following were taken into account as pharmacological treatment prior to admission: ACE inhibitors/angiotensin-2 receptor antagonists, non-steroidal anti-inflammatory drugs, and antihistamines and/or montelukast, as well as whether the patient was a healthcare professional or if the previous stay was in a residence or another healthcare center.

Initial data on arrival at the hospital

These included date of admission to the emergency room, date of admission to the hospital, date of onset of symptoms, date of microbiological confirmation, limitation of life support treatment and its date, and whether the patient required admission to the ICU. If the patient was admitted to the ICU, data on the risk of mortality (CURB-65 scale (Confusion, Urea nitrogen, Respiratory rate, Blood pressure, 65 years of age and older)), level of altered consciousness (Glasgow Scale) and other clinical variables were included: fever (≥38°C), respiratory rate >24 breaths per minute and systolic blood pressure <90 mm Hg in the first 24 hours, baseline oxygen saturation (SpO2), and number of quadrants affected on the chest radiograph (1–4).

Data on admission to the ICU

These included date of admission, Acute Physiology And Chronic Health Evaluation (APACHE) II scores and Sepsis related Organ Failure Assessment (SOFA) scores.

Analytical data

The closest analysis after hospital admission (emergency/admission), the first analysis since admission to the ICU and the last analysis of the hospital stay were included. The laboratory parameters collected were leukocytes, neutrophils, lymphocytes, platelets, C reactive protein (CRP), glutamate oxaloacetate transaminase (GOT), glutamate pyruvate transaminase (GPT), lactate dehydrogenase (LDH), serum creatinine, hemoglobin, procalcitonin (PCT), lactic acid, creatine phosphokinase (CPK), D-dimer and ferritin.

Pharmacological treatment

This was taken into account if the patient participated in a clinical trial. The drugs considered were lopinavir/ritonavir, remdesivir, interferon beta, hydroxychloroquine, chloroquine, darunavir/cobicistat, darunavir/ritonavir, darunavir/cobicistat/tenofovir/emtricitabine, fosamprenavir, tocilizumab, sarilumab, ciclosporin, anakinra, tacrolimus, eculizumab, azithromycin, immunoglobulins, baricitinib and tofacitinib. For all these drugs, the dosage regimen and duration of treatment were included, and in the case of tocilizumab/sarilumab the levels of interleukin 6 (in pg/mL) and D-dimer (in μg/mL) were taken into account before and after treatment, as well as where treatment was started (ICU/no ICU). Other treatments included antibiotics, vasopressors, prescribed and/or bolus corticosteroids, and use of low molecular weight heparin, distinguishing between prophylactic or treatment doses. The corticosteroids included were methylprednisolone, hydrocortisone, dexamethasone and prednisone.

Microbiological tests

The isolated micro-organism was taken into account in all cases. The tests were tracheal aspirate, blood cultures, presence of influenza and/or coinfection, pneumococcal antigen and Legionella antigen in urine.

Techniques performed during admission

The following were included: oxygen therapy, non-invasive ventilation (NIV), mechanical ventilation, ventilation in prone position, hemodialysis/hemofiltration and extracorporeal membrane oxygenation system.

Final evolution of the patient

First, the severity of the SARS-CoV-2 infection was indicated according to the classification of severity levels of respiratory infections included in the COVID-19 clinical management protocol of the Ministry of Health dated June 18, 2020. Complications during admission (acute respiratory distress syndrome (ARDS), sepsis, septic shock, nosocomial pneumonia (not COVID-19), other nosocomial infection (not COVID-19, not pneumonia), and acute renal and liver failure) were included. As a final assessment, improvement in symptoms (fever, cough, etc) together with radiological improvement and/or an arterial oxygen partial pressure/inspired oxygen fraction ratio (PaFi) ≥300 mm Hg or SpO2 >93% without oxygen administration during the first 7, 14, 21 or 28 days of admission, depending on the length of stay, was recorded. A distinction was made between hospital discharge and death (exitus). The date and destination of discharge (home, residence or support center, or unknown destination), date of discharge from the ICU, date of death and days of admission were included. Whether the patient was readmitted within 14 days after discharge was also taken into account.

Model development

An XGB-based method was implemented in this study because it is a flexible, highly efficient and portable supervised learning algorithm. Its main advantages are that it is fast to run, scalable and supports parallel computing.22–25 XGB algorithms are developed under the framework of gradient boosting. XGB features parallel tree boosting (also known as gradient-boosted decision trees), which solves many data science problems accurately and quickly. Here, XGB is adopted to build a COVID-19 patient classification model. Given a data set S = {(xj, yj)}, the XGB model was designed using the following:

$$\hat{y}_j = \sum_{p=1}^{P} t_p(x_j) \tag{1}$$

where $x_j$ is the input vector with m time variables, $\hat{y}_j$ is the predicted output, $y_j$ represents the output, $t_p$ represents a tree with leaf weight $w_p$ and structure $u_p$, $j = 1, 2, \ldots, n$, and $P$ corresponds to the number of trees.
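As a minimal sketch of the additive model just described, the raw prediction for one patient is the sum of the leaf weights that each tree assigns to that patient's input vector. The stump representation, feature positions, thresholds and weights below are hypothetical illustration values, not taken from the study, and the logistic mapping from the summed score to a probability is an assumption that is standard for binary gradient boosting.

```python
import math

# Each hypothetical tree is a depth-1 stump:
# (feature index, threshold, leaf weight if value < threshold, leaf weight otherwise).
TREES = [
    (0, 100.0, -0.4, 0.6),  # illustrative split on a laboratory value
    (1, 65.0, -0.3, 0.5),   # illustrative split on age
]

def tree_output(tree, x):
    """Return the leaf weight that one stump assigns to input vector x."""
    feature, threshold, left_weight, right_weight = tree
    return left_weight if x[feature] < threshold else right_weight

def ensemble_score(trees, x):
    """Raw prediction: the sum of the outputs of all trees."""
    return sum(tree_output(tree, x) for tree in trees)

def to_probability(score):
    """Logistic link (assumed) that maps the raw score to a probability."""
    return 1.0 / (1.0 + math.exp(-score))

high_risk = [150.0, 80.0]  # hypothetical patient falling in both right-hand leaves
low_risk = [20.0, 40.0]    # hypothetical patient falling in both left-hand leaves
high_score = ensemble_score(TREES, high_risk)
low_score = ensemble_score(TREES, low_risk)
```

Adding a further tree to `TREES` changes the score only by that tree's leaf weight, which is what makes the stage-wise training described in this section possible.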

The regularized objective function for the proposed method is shown in equation 2. In this case, it is different from that of ensemble methods. In the proposed method, a second-order Taylor expansion is implemented to approximate the objective function of XGB in order to improve prediction accuracy.22 23 To control the complexity of the model and avoid overfitting, the regulation term is used, which is represented by the weights of the leaf nodes and the tree depth.

Embedded Image (2)

Embedded Image (3)

As can be seen in equation 3, fp corresponds to the tree pruning used to control overfitting. fp shows the number of leaves on the tree. Pruning is a method to improve generalization in trees. Once the trees are built, the proposed XGB method performs a ‘pruning’ step that, starting at the bottom (where the leaves are) and moving up to the root node, looks to see if the gain falls below λ. If the first node encountered has a gain value below λ, then the node is pruned and the pruner moves up the tree to the next node. If, on the other hand, the node has a gain greater than λ, the node is left and the pruner does not check the parent nodes.23 24 26 The R () function penalizes the complexity of the method. The learning rate is shown by λ and w is the vector of leaf scores. R () represents a function that measures the difference between the target output $y_j$ and the predicted output $\hat{y}_j$. To control the complexity weight of the system, a parameter γ is employed.23 24 26 To improve performance, this study seeks to minimize equation 2.
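The bottom-up pruning step described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the `Node` class, the gain values and the threshold name `min_gain` are hypothetical. The text calls the threshold λ, whereas the XGBoost library exposes the equivalent setting as `gamma` (alias `min_split_loss`).

```python
class Node:
    """Binary tree node; `gain` is the loss reduction of its split (None for a leaf)."""

    def __init__(self, gain=None, left=None, right=None):
        self.gain = gain
        self.left = left
        self.right = right

    def is_leaf(self):
        return self.left is None and self.right is None

def prune(node, min_gain):
    """Collapse a split into a leaf when its gain is below min_gain, working bottom-up.

    A split is only examined once both of its children are leaves, so a split
    that is kept also protects all of its ancestors from being pruned.
    """
    if node.is_leaf():
        return node
    node.left = prune(node.left, min_gain)
    node.right = prune(node.right, min_gain)
    if node.left.is_leaf() and node.right.is_leaf() and node.gain < min_gain:
        return Node()  # replace the weak split with a single leaf
    return node

def count_leaves(node):
    """Number of leaves in the (sub)tree rooted at node."""
    if node.is_leaf():
        return 1
    return count_leaves(node.left) + count_leaves(node.right)

# Hypothetical tree: weak root split (0.2), strong left split (1.5), weak right split (0.1).
tree = Node(gain=0.2,
            left=Node(gain=1.5, left=Node(), right=Node()),
            right=Node(gain=0.1, left=Node(), right=Node()))
pruned = prune(tree, min_gain=0.5)
```

In this example the weak right-hand split is collapsed, while the root is kept even though its own gain is below the threshold, because the strong split beneath it was retained.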

Because the tree ensemble model in equation 2 includes functions as parameters,23 24 26 it cannot be optimized with traditional optimization methods in Euclidean space. Instead, the model is trained additively: let $\hat{y}_j^{(s)}$ denote the estimate for the j-th sample at the s-th iteration. With this, equation 2 takes the form shown in equation 4.

Embedded Image (4)

To reduce the objective function, the tree generated Cs by the j-th sample at the s-th iteration is added. Moreover, in the proposed method, the second-order approximation has been applied to optimize the objective function.22–24

Embedded Image (5)

where $g_j$ represents the first-order gradient statistic of the loss function R () and $h_j$ the second-order gradient statistic. The optimal weight w_rv of the leaf v for a fixed structure u(x) can be estimated as:

Embedded Image (6)
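To make the role of the gradient statistics concrete, the following sketch computes a closed-form leaf weight in the way the standard XGBoost formulation does, as minus the sum of first-order statistics divided by the sum of second-order statistics plus a regularization constant. The logistic loss, the function names and the parameter `reg_lambda` (used here in its usual XGBoost role as the L2 penalty on leaf weights) are assumptions for illustration; the study's own equation 6 is not recoverable from the text.

```python
import math

def sigmoid(z):
    """Logistic function mapping a raw score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def grad_hess(y_true, raw_score):
    """First- and second-order derivatives of the logistic loss w.r.t. the raw score."""
    p = sigmoid(raw_score)
    return p - y_true, p * (1.0 - p)

def optimal_leaf_weight(labels, raw_scores, reg_lambda=1.0):
    """Closed-form leaf weight: -(sum of gradients) / (sum of hessians + reg_lambda)."""
    stats = [grad_hess(y, s) for y, s in zip(labels, raw_scores)]
    grad_sum = sum(g for g, _ in stats)
    hess_sum = sum(h for _, h in stats)
    return -grad_sum / (hess_sum + reg_lambda)

# Hypothetical leaf holding four patients, three of whom died (label 1),
# all starting from a raw score of 0 (predicted probability 0.5).
weight = optimal_leaf_weight([1, 1, 1, 0], [0.0, 0.0, 0.0, 0.0], reg_lambda=1.0)
```

A larger `reg_lambda` shrinks the weight toward zero, which is how the regularization term limits the influence of any single leaf.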

Finally, the optimal value can be achieved by means of equation 7 for the proposed method.

Embedded Image (7)
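For reference, the standard XGBoost derivation (Chen and Guestrin) is written out below using the symbols of this section. Which of these lines corresponds to which numbered equation above is an assumption, because the original equation images are not recoverable here. The symbols $\Omega$, $T$, $g_j$, $h_j$, $I_v$, $G_v$ and $H_v$ are introduced only for this sketch, and $\lambda$ is used in its usual XGBoost role as the L2 penalty on leaf weights.

```latex
\begin{align}
% Additive ensemble of P trees
\hat{y}_j &= \sum_{p=1}^{P} t_p(x_j) \\
% Regularized objective: loss R plus a complexity penalty per tree
\mathcal{L} &= \sum_{j=1}^{n} R\bigl(y_j, \hat{y}_j\bigr) + \sum_{p=1}^{P} \Omega(t_p),
\qquad \Omega(t) = \gamma T + \tfrac{1}{2}\lambda \lVert w \rVert^{2} \\
% Objective at iteration s, after adding tree t_s to the previous estimate
\mathcal{L}^{(s)} &= \sum_{j=1}^{n} R\bigl(y_j, \hat{y}_j^{(s-1)} + t_s(x_j)\bigr) + \Omega(t_s) \\
% Second-order Taylor approximation (constant terms dropped)
\mathcal{L}^{(s)} &\approx \sum_{j=1}^{n} \Bigl[ g_j\, t_s(x_j) + \tfrac{1}{2} h_j\, t_s(x_j)^{2} \Bigr] + \Omega(t_s),
\quad g_j = \frac{\partial R\bigl(y_j, \hat{y}_j^{(s-1)}\bigr)}{\partial \hat{y}_j^{(s-1)}},
\quad h_j = \frac{\partial^{2} R\bigl(y_j, \hat{y}_j^{(s-1)}\bigr)}{\partial \bigl(\hat{y}_j^{(s-1)}\bigr)^{2}} \\
% Optimal weight of leaf v, with I_v the set of samples falling in that leaf
w_v^{*} &= -\frac{G_v}{H_v + \lambda},
\qquad G_v = \sum_{j \in I_v} g_j, \quad H_v = \sum_{j \in I_v} h_j \\
% Optimal objective value for a fixed tree structure with T leaves
\mathcal{L}^{*} &= -\frac{1}{2} \sum_{v=1}^{T} \frac{G_v^{2}}{H_v + \lambda} + \gamma T
\end{align}
```

The last expression is the quantity used to score candidate tree structures: a split is worthwhile only if it lowers this value by more than the per-leaf penalty $\gamma$.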

For this study, the proposed method was compared with different ML methods in order to classify patients into two groups: patients without risk and patients with risk of mortality from COVID-19. The methods involved decision tree (DT),27 Gaussian naïve Bayes (GNB),28 29 k-nearest neighbors (KNN),30 31 support vector machines (SVM)32 33 and the proposed method XGB.23 24 The MATLAB Statistics and Machine Learning Toolbox (MATLAB V.2021a; The MathWorks, Natick, Massachusetts, USA) was used to implement the models. Fivefold cross-validation was applied to avoid overfitting. The database was divided into two groups, 70% for training and 30% for testing, and no patient appeared in both groups.
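The partitioning described above can be sketched in a few lines. The study itself used MATLAB; this Python version is only an illustration, and it assumes that the fivefold cross-validation was run inside the 70% training partition, which the text does not state explicitly. The function names and the random seed are hypothetical.

```python
import random

def train_test_split_ids(patient_ids, test_fraction=0.3, seed=0):
    """Shuffle unique patient IDs and split them so no patient is in both sets."""
    ids = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]

def k_fold(ids, k=5):
    """Yield (training_ids, validation_ids) pairs for k-fold cross-validation."""
    folds = [ids[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        training = [pid for j, fold in enumerate(folds) if j != i for pid in fold]
        yield training, validation

patients = list(range(203))  # 203 patients, as in the study
train_ids, test_ids = train_test_split_ids(patients)
folds = list(k_fold(train_ids, k=5))
```

Splitting on patient identifiers rather than on individual records is what keeps any one patient from contributing to both training and testing.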

The phases implemented for the whole study are described in figure 1. As can be seen, the subjects to be studied were first chosen. Once the database was created, training and validation of the ML methods were carried out.

Figure 1

Training and validation scheme for machine learning methods.

Performance evaluation

In this paper, the different methods were compared with the following metrics: degenerate Youden index (DYI), specificity, precision (also known as positive predictive value), recall (also known as sensitivity), balanced accuracy, receiver operating characteristic (ROC) curve, area under the curve (AUC) and F1 score. The F1 score is defined as:

$$F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} \tag{8}$$

Matthews correlation coefficient (MCC) was also used to test the performance of the ML methods, defined as:

$$\text{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{9}$$

where TP represents the number of true positives, FP is the number of false positives, TN shows the number of true negatives and FN corresponds to the number of false negatives. Cohen’s kappa index was used to estimate the overall performance of the system.34
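The metrics above all derive from the four confusion matrix counts, as the following sketch shows. The counts in the example are hypothetical and are not results from the study. The DYI is left out because its definition is not given in the text, and the function assumes that every denominator is non-zero.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion matrix counts."""
    total = tp + fp + tn + fn
    recall = tp / (tp + fn)            # sensitivity
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)         # positive predictive value
    accuracy = (tp + tn) / total
    balanced_accuracy = (recall + specificity) / 2
    f1 = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Cohen's kappa: observed agreement corrected for agreement expected by chance.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "f1": f1,
        "mcc": mcc,
        "kappa": kappa,
    }

# Hypothetical counts for 100 patients (50 deaths, 50 survivors).
metrics = classification_metrics(tp=40, fp=5, tn=45, fn=10)
```

With these counts the accuracy is 0.85 while kappa is 0.70, which illustrates why chance-corrected indices such as kappa and MCC are reported alongside accuracy.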


Results

This section describes the results obtained by using patient records to train and validate the COVID-19 mortality classifiers. The performance of the proposed system was compared with that of ML methods widely accepted in the scientific community.

Table 1 presents the results achieved by the SVM, DT, GNB and KNN classification methods and by the proposed system for mortality classification of patients with COVID-19. As can be seen, the systems based on SVM and GNB obtained lower accuracy values than the rest of the methods; these values are close to 81%. The DT and KNN methods show improved classification capability, obtaining an accuracy value of 83%. The proposed XGB system achieved an accuracy value of 92%, a marked increase over the other methods, which translates to better prediction. The algorithms that come closest to XGB in terms of precision and recall values are KNN and DT, which again performed better than SVM and GNB. As can be seen in table 1, the same holds for the F1 score, where XGB obtained higher values, which imply an improvement in classification.

Table 1

Mean value and SD of balanced accuracy, recall, precision, F1 score, AUC, MCC, DYI and kappa of the machine learning models and the proposed method implemented in this study

To test the performance of the proposed XGB system in classifying mortality of patients with COVID-19, other parameters widely used in the literature, such as AUC, MCC, DYI and kappa index, were calculated. The MCC is one of the most reliable statistical indices available. This coefficient produces a high score only if the prediction performs well in all four categories of the confusion matrix (true positives, false negatives, true negatives and false positives), in proportion to both the number of positive elements and the number of negative elements in the data set. As can be observed in table 1, the proposed method, XGB, achieved an MCC of 84.23%, exceeding the values achieved by KNN and DT, which were 75.16% and 72.94%, respectively. Both SVM and GNB showed worse performance in this parameter. As for the kappa index, XGB obtained a value close to 85%, improving on the values of KNN and DT by 9.28 and 11.56 percentage points, respectively. The same is true for the AUC and DYI parameters: the XGB method achieved a higher value, which means it can better classify mortality in patients with COVID-19.

Figure 2 shows a summary of the comparison between the XGB method and the other classifiers with respect to accuracy, recall and precision. XGB achieved values of 0.924, 0.924 and 0.925, respectively, while those of KNN were 0.854, 0.855 and 0.860. Figure 2 also shows the values obtained for MCC, kappa and F1 score. The proposed method obtained values of 0.842, 0.851 and 0.924, respectively. The next closest system to XGB is KNN, with values of 0.752, 0.758 and 0.860. In all parameters, it can be observed how the proposed method shows better performance in predicting mortality.

Figure 2

Graphical representation of precision, recall, accuracy, MCC, kappa and F1 score values in percentages. DT, decision tree; GNB, Gaussian naïve Bayes; KNN, k-nearest neighbors; MCC, Matthews correlation coefficient; SVM, support vector machine; XGB, eXtreme Gradient Boosting.

In addition, ROC curves were used to compare the classification capability of the proposed system with that of the other ML methods. The curve is obtained by plotting, for each threshold value, the sensitivity against 1 − specificity.35 Figure 3 shows the results obtained by the different systems in classifying patients with COVID-19 who died versus those who survived. The XGB method has the largest area under the curve, which implies better separation of the two classes; the values can be seen in table 1.
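As a sketch of how an ROC curve and its AUC are obtained, the example below sweeps a decision threshold over predicted scores, records the true positive rate (sensitivity) and false positive rate (1 − specificity) at each threshold, and integrates with the trapezoidal rule. The labels and scores are hypothetical and are not data from the study.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lowering the threshold step by step classifies more samples as positive.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical outcomes (1 = died, 0 = survived) and predicted risk scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
area = auc(curve)
```

The resulting area equals the fraction of (died, survived) pairs in which the patient who died received the higher score, which is why a larger AUC indicates better separation of the two classes.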

Figure 3

ROC curves for the five assessed machine learning predictors. DT, decision tree; GNB, Gaussian naïve Bayes; KNN, k-nearest neighbors; ROC, receiver operating characteristic; SVM, support vector machine; XGB, eXtreme Gradient Boosting.

For clarity, all metrics have been grouped for each data set (training and test) and are presented as a radar plot. A perfect score on all metrics would be represented by a shape covering the entire grid. In our study, the models have higher scores on all metrics for the training set and generally lower scores for the test set. The shape of the graphs can also be indicative of the quality of the models. The larger the area enclosed by the test set plot, the better the prediction method. The proposed XGB system (figure 4) is a good example of a balanced model: the training and test sets give rise to similar radar plots. This similarity indicates that the system reached a good training point, without overfitting or underfitting, and therefore generalizes well. That is, given a new input, the system is likely to provide a correct output. As can be seen, the GNB method performed the worst on most metrics. In view of the results obtained, the proposed XGB system classifies patients with COVID-19 automatically and with high accuracy, which suggests that this tool could be of great help in clinical practice.

Figure 4

Radar plot of the training phase (top) and test (bottom) for prediction of mortality in patients with COVID-19. AUC, area under the curve; DT, decision tree; GNB, Gaussian naïve Bayes; KNN, k-nearest neighbors; MCC, Matthews correlation coefficient; SVM, support vector machine; XGB, eXtreme Gradient Boosting.


Discussion

The current SARS-CoV-2 pandemic is associated with high morbidity and mortality.36 37 Most mortality prediction models for COVID-19 that use ML are based partially or totally on subjective clinical data, which may vary depending on the study.38 As far as we know, this is the first study to develop, compare and evaluate five supervised ML methods in the Spanish population to predict mortality in patients admitted with COVID-19 in a tertiary hospital.

ML models analyzed and related work

Unlike other studies,17–20 273 clinical, demographic and laboratory predictors were included to fit the models. Of all the ML classifiers applied, the XGB method was the pattern recognition method that managed to more precisely discriminate between patients at risk of mortality from COVID-19 and those who are not. This model was analyzed and compared with different supervised ML methods described in the literature, such as GNB, DT, KNN or SVM. Current ML classification methods, used in biomedical applications, have shown that supervised algorithms, whether regression or classification, such as GNB, DT, KNN or SVM, usually have higher average accuracy than their unsupervised counterparts.39 40 In addition, individually applied methods are limited in their precision, but combination of methods, when applied correctly, can have higher overall classification precision, as is the case with the proposed XGB method.39 40 In our study, the SVM and GNB methods performed the worst, with KNN the method that most closely approximates the precision values of the proposed method. This is in line with the results of studies describing these supervised ML algorithms in predicting mortality from COVID-19.41–44 Different studies19 20 using the XGB method for COVID-19 mortality prediction obtained accuracy values above 90%, as in our case, but in North American and Chinese populations. The number of variables included in these studies was much lower than in our study, and in addition pharmacological treatments, both before and during hospital stay, were not considered as variables of interest. Our study provides a similar radar plot between the training and test phases, indicating that the system does not lose much predictive capability. The results show that the proposed model can handle large data dimensions, avoiding overtraining, and significantly improves the performance of other classification methods. 
It achieved higher values for precision, recall and accuracy than those achieved by the other methods. This supports its reliability for the automatic classification of the desired outcome. XGB is a predictive model that has excellent scalability and high execution speed.45 It has been applied in biomedicine (table 2) to classify patients with cancer,46 epilepsy,47 atrial fibrillation48 and those at risk of hypertension,49 and to diagnose chronic kidney disease.50 Yu et al 51 and Zhong et al 52 took advantage of the XGB method to predict the location of submitochondrial and essential proteins in their respective work.

Table 2

Comparison of XGB method as a classification in biomedical applications

Predictors of mortality and related work

In our study, the predictors of mortality, in order of weighting, were CRP, PCT, GOT, NIV days, neutrophils, GPT, D-dimer, creatinine, septic shock, age, lactic acid and ferritin. White cell counts and platelets were also weighted to a lesser degree. Of the patients, 52.7% were male and 65.5% of the total were ≥65 years old. Of the patients, 22.7% were deceased and 16.2% were admitted to the ICU, with both percentages higher than in other studies.18 53 54 Consistent with other studies,16 17 20 advanced age was the main demographic predictor of hospital mortality in patients with COVID-19. The study by Sánchez-Montañes et al 18 applied different ML methods, with age being the most important predictor of mortality. The systematic review by Zheng et al 55 included 3027 patients and showed age ≥65 years (OR 6.06, 95% CI 3.98 to 9.22) as the factor that was most associated with progression of COVID-19. Other authors that used other ML models, such as the artificial neural network56 or the deep learning model,36 also highlighted age as a predictor of progression to a severe/critical clinical picture of severity and/or mortality. The clinical predictors included the scores obtained on the APACHE II, SOFA and CURB-65 scales. Although all of these are useful in predicting mortality in patients with COVID-19,57 in our study they did not have a significant weight. 
On the other hand, comorbidities such as diabetes and hypertension have been described as risk factors for poor prognosis and progression in patients with COVID-19.58 59 In our study, we did not find an association between these comorbidities and mortality from COVID-19, as is the case in other studies.53 History of cardiac comorbidities and CPK measurement were taken into account as a marker of cardiac dysfunction, unlike in other studies which preferentially used elevated cardiac troponin as an indicator of cardiac injury.15 36 SARS-CoV-2 interacts with the cardiovascular system on multiple levels and heart problems are associated with higher mortality in patients with COVID-19,15 36 although in our study there was no association in this regard. Other predictors that were positively associated with mortality were septic shock and NIV days, as described in the systematic review by Adamidi et al.43 Most of the studies in this review showed SpO2 and respiratory failure as predictors of mortality instead of talking about patients with NIV. Elevated blood urea nitrogen (BUN) and D-dimer and lymphocytopenia were associated with extrapulmonary disorders and possible multiorgan damage caused by COVID-19,16 all of which were a result of septic shock due to infection. The laboratory parameters were obtained after patients’ admission. Those related to altered kidney function, such as BUN and serum creatinine, were associated with a worse prognosis in these patients,60 similar to our case. Different studies have identified acute kidney injury (AKI) as a sequela in patients with severe COVID-19, many of whom died.61 A Cox regression analysis showed that proteinuria, hematuria, and elevated BUN and creatinine levels, among other characteristics, were significantly associated with death of patients with COVID-19.62 This analysis suggested that patients with COVID-19 who developed AKI are at risk of mortality ∼5.3 times greater than those without AKI. 
As in our research, other studies using ML17 19 20 36 57 63 identified the following laboratory parameters as predictors of severity and mortality: CRP, lactic acid, PCT, ferritin, D-dimer, GOT, GPT and neutrophils. PCT is elevated during bacterial infection, but less so during viral infection, suggesting that bacterial coinfection leads to worse outcomes in patients with COVID-19.36 Elevated serum ferritin is associated with ARDS.64 Wu et al 65 conducted a retrospective cohort study of 201 patients with COVID-19 and found that elevated serum ferritin was an independent risk factor related to the development of ARDS, but no similar association was observed in terms of mortality, possibly due to insufficient sample size. The meta-analysis of Henry et al 66 confirmed serum ferritin as a possible biomarker of progression to critical illness in patients with COVID-19. D-dimer has also been associated with mortality in patients with COVID-19.19 62 This is a marker of hypercoagulability and thrombosis that has been found to be elevated in patients with COVID-19.19 Concentrations greater than 1 µg/mL are associated with poor prognosis in the initial stages of the disease.53 Elevated GOT levels due to liver dysfunction have been seen in severe cases of COVID-19.67 Jiang et al 63 used supervised learning and found that elevation in GPT was predictive of severe ARDS in patients with COVID-19. Elevation of both enzymes and therefore liver disease are considered predictors of severity in these patients.68 Finally, low levels of leukocytes and neutrophils have also been described as predictors of severity,69 as well as thrombocytopenia described in critically ill patients with COVID-19.70 The recent systematic review by Bottino et al 44 concludes, as does our study, that among the predictors most associated with mortality are age and CRP and LDH levels.

XGB as a predictive model of mortality

XGB is a binary classification method that is easy to implement and train, and its predictive performance can be expected to improve as more data become available.44 Similarly, Sánchez-Salmerón et al 71 in their systematic review highlight the XGB method as one of the models that achieve the highest level of predictive accuracy and as a potentially good tool to aid the triage of patients with COVID-19. Wan et al 72 in their recent study used the random forest classifier, which has characteristics very similar to those of XGB, and obtained similar results with both.
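To illustrate the idea behind boosted classifiers such as XGB, the sketch below implements plain gradient boosting with one-split trees (stumps) and logistic loss in standard-library Python. It is not the XGBoost library and not the model trained in this study: XGB adds regularization and second-order information, among other refinements. The data are synthetic, the feature names (age, CRP, an uninformative column) are chosen only for illustration, and all function names and parameter values are assumptions of this sketch.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Return (feature, threshold, left_value, right_value) minimising squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, lm, rm)
    return best[1:]

def fit_boosting(X, y, n_rounds=30, learning_rate=0.3):
    """Each round fits a stump to the negative gradient of the logistic loss (y - p)."""
    p0 = sum(y) / len(y)
    base = math.log(p0 / (1.0 - p0))  # start from the log-odds of the event rate
    scores = [base] * len(X)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, thr, lv, rv = fit_stump(X, residuals)
        stumps.append((j, thr, lv, rv))
        scores = [s + learning_rate * (lv if row[j] <= thr else rv)
                  for row, s in zip(X, scores)]
    return base, stumps, learning_rate

def predict_proba(model, row):
    base, stumps, lr = model
    score = base + sum(lr * (lv if row[j] <= thr else rv) for j, thr, lv, rv in stumps)
    return sigmoid(score)

# Synthetic cohort (not study data): outcome depends on age and CRP plus noise.
random.seed(0)
X, y = [], []
for _ in range(120):
    age, crp, unrelated = random.uniform(30, 95), random.uniform(0, 250), random.random()
    risk = 0.05 * (age - 65) + 0.01 * (crp - 100) + random.gauss(0, 0.5)
    X.append([age, crp, unrelated])
    y.append(1 if risk > 0 else 0)

model = fit_boosting(X, y)
probs = [predict_proba(model, row) for row in X]
accuracy = sum((p >= 0.5) == bool(t) for p, t in zip(probs, y)) / len(y)
print(f"training accuracy on synthetic data: {accuracy:.2f}")
```

Because each round corrects the errors left by the previous ones, adding data or rounds refines the fit incrementally. The accuracy printed here is measured on the training data and says nothing about performance on unseen patients, which requires held-out validation.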

Comparative studies have revealed that ML methods can be more accurate and efficient than traditional logistic regression analysis, especially when the sample size is limited.73 Including data from other modalities, such as genomic profiling and medical imaging, could further improve the predictive performance of the presented model. Since the length of hospital stay for most patients was greater than 1 week, our model can predict patients’ outcome more than 1 week in advance.


ML techniques are among the most sophisticated and accurate tools for predicting events of interest in general and COVID-19 mortality in particular. Of the five ML methods studied and validated, the XGB method obtained the highest accuracy in predicting hospital mortality due to COVID-19, with the following predictors of hospital mortality having the highest weight: nine biomarkers (CRP, PCT, GOT, GPT, neutrophils, D-dimer, creatinine, lactic acid and ferritin), days of NIV, septic shock and age. Other variables of interest were white cell counts and platelets. None of the pharmacological treatments included in the study had sufficient weight in predicting mortality for any of the models used.

The XGB method achieves a prediction accuracy of 92%, a 6.95% improvement over KNN, the second-best ML method. The XGB method can help healthcare professionals in stratifying cases and in making decisions about resource allocation and treatment optimization for patients with COVID-19. This study lays the groundwork for future multicenter studies with large inpatient and home-based populations. The results of this work may facilitate implementation of optimal economic and socio-health policies.
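For readers comparing classifiers in this way, the sketch below shows how accuracy, precision and recall follow from confusion-matrix counts, and how an improvement between two accuracies can be expressed in relative or absolute terms. The counts and accuracies are hypothetical and are not results of this study, and the text does not specify which of the two conventions underlies the reported improvement.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),  # share of predicted deaths that were deaths
        "recall": tp / (tp + fn),     # share of actual deaths that were predicted
    }

def relative_improvement(new, old):
    """Improvement of `new` over `old`, as a percentage of `old`."""
    return 100.0 * (new - old) / old

def absolute_improvement(new, old):
    """Improvement of `new` over `old`, in percentage points."""
    return 100.0 * (new - old)

# Hypothetical confusion matrix for a mortality classifier.
metrics = classification_metrics(tp=18, fp=2, fn=3, tn=77)
print(metrics)

# Hypothetical accuracies for two models: the same gap reads differently
# depending on the convention used.
rel = relative_improvement(0.90, 0.80)
abs_points = absolute_improvement(0.90, 0.80)
print(f"relative: {rel:.1f}%, absolute: {abs_points:.1f} percentage points")
```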

Data availability statement

Data are available upon reasonable request.

Ethics statements

Patient consent for publication

Ethics approval

This study involves human participants and was approved by the General University Hospital of Valencia. Participants gave informed consent to participate in the study before taking part.



  • Contributors AR, AMT, JMi, JC, PB and JMa contributed to the design of the work, and to the acquisition, analysis and interpretation of data for the work. AR, AMT, JMi, JC, PB and JMa contributed to the drafting of the work and revising it critically for important intellectual content. AR, AMT, JMi, JC, PB and JMa agreed on all aspects of the work related to the accuracy or integrity of any part of the work. AR, AMT, JMi, JC, PB and JMa contributed to the final approval of the version to be published. AR is responsible for the overall content as guarantor.

  • Funding This work was sponsored by the General University Hospital of Valencia (Spain), Fondo Europeo de Desarrollo Regional (FEDER) and Instituto de Salud Carlos III (PI20/01363; JMi), Centro de Investigaciones Biomedicas en Red de Enfermedades Respiratorias (CIBERES) (CB06/06/0027; JMi), and the Institute of Technology (University of Castilla-La Mancha).

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.