
Correspondence on ‘Prospective predictive performance comparison between clinical gestalt and validated COVID-19 mortality scores’ by Soto-Mota et al
Héctor David Meza-Comparán

Instituto Nacional de Geriatría, Mexico City, Mexico

Correspondence to Dr Héctor David Meza-Comparán, Dirección de Investigación, Instituto Nacional de Geriatría, Mexico City, Mexico; hmezacomparan{at}


Dear Editor,

I read the article ‘Prospective predictive performance comparison between clinical gestalt and validated COVID-19 mortality scores’ with great interest.1 The authors compared several COVID-19 mortality prediction models validated in Mexican patients (LOW-HARM, MSL-COVID-19, Nutri-CoV, and the neutrophil-to-lymphocyte ratio (NLR)), together with qSOFA and NEWS2, against clinical gestalt for predicting mortality among patients with COVID-19 admitted to a tertiary hospital, concluding that clinical gestalt was non-inferior. I would like to comment on some issues with this article.

It is unclear what “clinical gestalt” meant in the study, since the authors provided no formal definition beyond the study procedures. Others have defined clinical gestalt as “a physician’s unstructured estimate”2 or an “overall clinical impression”.3

Additionally, it is not clear how the authors selected the prediction models to be evaluated. They mentioned that three models validated in datasets including Mexican patients were included; however, in the absence of explicit inclusion criteria, other models validated in Mexican patients could have been left out. I therefore performed a systematic search within the COAP search engine and LILACS of studies published up to November 5, 2021 (figure 1). Nine studies describing 17 COVID-19 mortality prediction models validated in the Mexican population were identified (table 1),4–12 only four of which were evaluated by Soto-Mota and colleagues (LOW-HARM, MSL-COVID-19, Nutri-CoV, and NLR).4 7 The authors thus left several important prediction models validated to predict mortality in Mexican patients unevaluated.

Figure 1

Systematic search flowchart of studies included and reasons for exclusion. The search within COAPa used the keywords and Boolean operators (mortality) AND (mexico) OR (mexican). Within LILACSb, the keywords and Boolean operators (COVID-19) AND (mortality) AND (mexico) OR (mexican) were used, and an affiliation country filter for “Mexico” was also applied. These searches retrieved 778 records (610 and 168, respectively), of which 193 studies were retained for abstract and full-text screening. Nine studies describing 17 COVID-19 mortality prediction models validated in the Mexican population were identified. aCOAP is a daily-updated database of published SARS-CoV-2 and COVID-19 articles from PubMed, EMBASE and PsycINFO, and of preprints from medRxiv and bioRxiv (further information at bLILACS is one of the most important and comprehensive databases of scientific information in Latin America and the Caribbean, with more than 880 thousand records of peer-reviewed journals, theses and dissertations, government documents, conference proceedings and books (further information at

Table 1

COVID-19 mortality prediction models validated in Mexican patients, identified through the systematic search.

Although the authors reported the median years of hospital experience (which in Mexico could include medical internship and social service) of the medical residents who made the predictions, disclosing their corresponding postgraduate year (PGY) would have been important, since confidence in the predictions was generally low in this study: only ~35% of predictions were made with >80% confidence. While the authors argued that “with the COVID-19 pandemic, clinicians of all levels of training started their learning curve at the same time”, senior residents are less likely to be under-confident than junior residents.13

Furthermore, the statement “no score was significantly better than clinical gestalt predictions” is questionable because of concerns regarding sample size. An inadequate sample size could have prevented the detection of true differences, especially since the authors used easyROC, an open web calculator that estimates, among other things, sample sizes for non-inferior ROC comparisons, to estimate the sample size for their study. Of note, easyROC requires as input the “smallest difference” between the tests’ AUCs, not the “maximal AUC difference” the authors report. Most importantly, easyROC was not developed to estimate sample sizes for evaluating non-inferiority between prognostic prediction models; it was developed to compare diagnostic test models.14

Finally, it is worthwhile mentioning that while in younger patients obesity is the strongest risk factor for short-term mortality,15 chronological age remains the single most important predictor of in-hospital COVID-19 mortality.9

Ethics statements

Patient consent for publication

Ethics approval

This study does not involve human participants.


I would like to thank Javier Mancilla-Galindo and Ashuin Kammar-García for their invaluable comments and recommendations regarding the manuscript, as well as Vianey Fragoso-Saavedra for her support.



  • Twitter @HectorMezaMD

  • Contributors HDM-C assumes sole responsibility for the drafting, writing and revising of the manuscript.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.
