Recurrent neural network models (CovRNN) for predicting outcomes of patients with COVID-19 on admission to hospital: model development and validation using electronic health record data

Abstract
Background
Predicting outcomes of patients with COVID-19 at an early stage is crucial for optimised clinical care and resource management, especially during a pandemic. Although multiple machine learning models have been proposed to address this issue, because of their requirements for extensive data preprocessing and feature engineering, they have not been validated or implemented outside of their original study site. Therefore, we aimed to develop accurate and transferable predictive models of outcomes on hospital admission for patients with COVID-19.
Methods
In this study, we developed recurrent neural network-based models (CovRNN) to predict the outcomes of patients with COVID-19 by use of available electronic health record data on admission to hospital, without the need for specific feature selection or missing data imputation. CovRNN was designed to predict three outcomes: in-hospital mortality, need for mechanical ventilation, and prolonged hospital stay (>7 days). For in-hospital mortality and mechanical ventilation, CovRNN produced time-to-event risk scores (survival prediction; evaluated by the concordance index) and all-time risk scores (binary prediction; area under the receiver operating characteristic curve [AUROC] was the main metric); we only trained a binary classification model for prolonged hospital stay. For binary classification tasks, we compared CovRNN against traditional machine learning algorithms: logistic regression and light gradient boost machine. Our models were trained and validated on the heterogeneous, deidentified data of 247 960 patients with COVID-19 from 87 US health-care systems derived from the Cerner Real-World COVID-19 Q3 Dataset up to September 2020. We held out the data of 4175 patients from two hospitals for external validation. The remaining 243 785 patients from the 85 health systems were grouped into training (n=170 626), validation (n=24 378), and multi-hospital test (n=48 781) sets. Model performance was evaluated in the multi-hospital test set. The transferability of CovRNN was externally validated by use of deidentified data from 36 140 patients derived from the US-based Optum deidentified COVID-19 electronic health record dataset (version 1015; from January, 2007, to Oct 15, 2020). Exact dates of data extraction were masked by the databases to ensure patient data safety.
Findings
CovRNN binary models achieved AUROCs of 93·0% (95% CI 92·6–93·4) for the prediction of in-hospital mortality, 92·9% (92·6–93·2) for the prediction of mechanical ventilation, and 86·5% (86·2–86·9) for the prediction of a prolonged hospital stay, outperforming light gradient boost machine and logistic regression algorithms. External validation confirmed AUROCs in similar ranges (91·3–97·0% for in-hospital mortality prediction, 91·5–96·0% for the prediction of mechanical ventilation, and 81·0–88·3% for the prediction of prolonged hospital stay). For survival prediction, CovRNN achieved a concordance index of 86·0% (95% CI 85·1–86·9) for in-hospital mortality and 92·6% (92·2–93·0) for mechanical ventilation.
Interpretation
Trained on a large, heterogeneous, real-world dataset, our CovRNN models showed high prediction accuracy and transferability through consistently good performances on multiple external datasets. Our results show the feasibility of a COVID-19 predictive model that delivers high accuracy without the need for complex feature engineering.
Funding
Cancer Prevention and Research Institute of Texas.
Introduction
By the end of 2021, there were more than 295 million confirmed SARS-CoV-2 infections worldwide and more than 825 000 deaths attributable to COVID-19 in the USA alone (Coronavirus disease [COVID-19] pandemic; COVID data tracker). Additionally, there have been around 3·7 million COVID-19-related hospital admissions recorded since August 2020 in the USA (COVID data tracker). During the peaks of the pandemic waves, many US states reported near-capacity hospital and intensive care unit use. Accurate prediction of the future clinical trajectories of patients with COVID-19 at the time of admission is crucial for clinical decision making and enables the efficient allocation of resources. Indeed, several models for the prediction of COVID-19 outcomes have been developed. Wynants and colleagues reviewed 107 COVID-19 prognostic models published before July 1, 2020. The most common issue highlighted in this study was the high risk of bias associated with the reviewed models, which was caused by either a small, locally sourced training dataset and the subsequent high risk of model overfitting or the absence of model calibration or external validation (Sperrin, Grant, and Peek). Through an updated survey of the literature, as of Dec 31, 2021, we found that only four studies (Schwab et al; He, Page, Weinberg, and Mishra; Feng, Proceedings of the 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, 2021, abstr 61; Bennett et al) involved training the proposed models of COVID-19 outcomes on data from more than 20 000 patients. Moreover, all four models are based on a small set of specific features and need a laborious data preprocessing and feature engineering process that limits the transferability, reliability, and sustainability of the models.
Evidence before this study
Although many methods for predicting COVID-19 outcomes have been developed, they have not been widely externally validated because of their limited transferability. A key obstacle to the transferability of such methods is the need for laborious data preprocessing and feature engineering. A 2020 systematic review that critically appraised prediction models for diagnosing and prognosing COVID-19 revealed that the majority of 107 prognostic models published before July 1, 2020, have a high risk of overfitting bias. Using the Prediction model Risk Of Bias ASsessment Tool (known as PROBAST), the authors identified common causes for biased results, including training the model on a small, locally sourced dataset, which leads to a high risk of model overfitting, and the absence of model calibration or external validation. To provide an updated survey of the literature, we searched Scopus and PubMed for articles published in English between July 1, 2020, and Dec 31, 2021, predicting COVID-19 outcomes using the keywords “COVID electronic health record (‘mortality’ or ‘ventilator’ or ‘length of stay’ or ‘real-time’) prediction”. The literature search retrieved a total of 466 unique articles, and, on review, we found 53 studies that describe the development and validation of machine learning predictive models for predicting prognosis for patients with COVID-19 after admission. Of the 53 studies, only four involved training and evaluating the models on a multi-sourced cohort of more than 20 000 patients with COVID-19. The proposed models in these studies, however, still require extensive data preprocessing and feature engineering, which limits the transferability, reliability, and sustainability of such models.
Added value of this study
We propose a machine learning model training framework that can flexibly adapt to the changing pandemic and requires minimal preprocessing. For convenience and practicality, our framework is designed to consume electronic health record data mapped to standard terminologies in common use without the need for specific feature selection or missing value imputation. Because they were trained and evaluated on large, heterogeneous datasets collected from different health systems, our COVID-19 outcome prediction models (CovRNN) showed high accuracy in predicting three outcomes (in-hospital mortality, need for mechanical ventilation, and prolonged hospital stay), outperformed state-of-the-art models in the literature, showed good calibration, and had a low risk of bias. In addition, our models can be fine-tuned on new data for continuous improvement, as recommended by the US Food and Drug Administration’s Good Machine Learning Practice. Furthermore, our framework includes a utility for explaining model predictions to facilitate clinical judgment of those predictions.
Implications of all the available evidence
While consuming structured, categorical data from electronic health records, deep learning-based models can achieve state-of-the-art prediction accuracy in their standard format without the need for feature selection or missing value imputation, which means that the trained models can be easily validated on new data sources. We validated our trained models across datasets from different sources, indicating the transferability of our models. Our model development framework can be further applied to train and evaluate predictive models for different types of clinical events. For clinicians who are fighting COVID-19 on the frontlines, there are two potentially actionable contributions of our work. Clinicians can (1) fine-tune our pretrained models on their local data (regardless of cohort size), establish utility, and then deploy the models and (2) use our comprehensive model development framework to train a predictive model using their own data.
Results
Table 1: Descriptive statistics for CRWD and OPTUM extracted cohorts
Data are median (IQR) or n (%). CRWD=Cerner Real-World COVID-19 Q3 Dataset. NA=not applicable. OPTUM=Optum deidentified COVID-19 electronic health record dataset.
Table 2: Model performance on different CRWD test sets
Data are area under the receiver operating characteristic curve (95% CI), unless otherwise indicated.
Table 3: Performance of CovRNN models on the OPTUM test set before and after fine-tuning
Data are area under the receiver operating characteristic curve, unless otherwise indicated. All data are based on evaluation in the OPTUM test set. CRWD=Cerner Real-World COVID-19 Q3 Dataset. OPTUM=Optum deidentified COVID-19 electronic health record dataset.
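The two headline metrics reported for the binary and survival models can be illustrated with a minimal, dependency-free sketch. This is not the evaluation code used in the study: the function names are hypothetical, the concordance index shown is Harrell's variant, and pairs with tied observed times are simply skipped, which is an assumption made for brevity.

```python
from itertools import combinations


def auroc(labels, scores):
    """Probability that a random positive case is scored above a random
    negative case; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def concordance_index(times, events, risks):
    """Harrell's C: among comparable pairs, the fraction in which the patient
    with the earlier observed event was given the higher risk score."""
    concordant, comparable = 0.0, 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that a is the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue  # tied times, or the earlier patient was censored
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

Both functions are quadratic in the number of patients, so they are suitable for checking small examples rather than cohorts of the size described here.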

Figure 3: Kaplan–Meier curves in the stratified survival analysis
In-hospital mortality (A) and mechanical ventilation (B) in the multi-hospital test set of the Cerner Real-World COVID-19 Q3 Dataset. In-hospital mortality (C) and mechanical ventilation (D) in the test set of the Optum deidentified COVID-19 electronic health record dataset. Stratification of patients is according to their predicted survival score over time in days since admission. Shaded areas indicate 95% CIs calculated on the logarithmic scale from the SEs of the Kaplan–Meier estimator with the centre values corresponding to the Kaplan–Meier estimate.
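The shaded intervals described in the caption can be sketched as follows. This is a minimal illustration rather than the plotting code used for the figure: it assumes Greenwood's formula for the standard error of log S(t), and the function name `kaplan_meier` is hypothetical.

```python
import math


def kaplan_meier(times, events, z=1.96):
    """Return [(time, survival, lower, upper)] at each distinct event time.

    Intervals are built on the log scale, exp(log S +/- z * SE[log S]),
    with SE[log S] taken from Greenwood's formula (an assumption here).
    """
    at_risk = len(times)
    survival, var_log = 1.0, 0.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)  # events plus censorings at t
        if deaths:
            survival *= 1.0 - deaths / at_risk
            if survival > 0.0:
                var_log += deaths / (at_risk * (at_risk - deaths))
                half_width = z * math.sqrt(var_log)
                lower = survival * math.exp(-half_width)
                upper = min(1.0, survival * math.exp(half_width))
            else:
                lower = upper = 0.0  # the interval is undefined once S(t) is 0
            curve.append((t, survival, lower, upper))
        at_risk -= leaving
    return curve
```

Calling `kaplan_meier(times, events)` once per risk stratum gives the step function and band for each curve in a stratified plot.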

Figure 4: Subgroup analysis using the CRWD multi-hospital test set
(A) Age group. (B) Comorbidity. (C) US census region. (D) Race. AUROC=area under the receiver operating characteristic curve. CRWD=Cerner Real-World COVID-19 Q3 Dataset.

Figure 5: Calibration plots for the CRWD validation set, CRWD multi-hospital test set, and OPTUM test set
(A) In-hospital mortality. (B) Mechanical ventilation. (C) Prolonged hospital stay. CRWD=Cerner Real-World COVID-19 Q3 Dataset. OPTUM=Optum deidentified COVID-19 electronic health record dataset.
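The points of a calibration plot such as those in this figure can be computed with a short sketch. The binning scheme is not specified in the text; equal-width probability bins and the function name `calibration_bins` are assumptions made for illustration.

```python
def calibration_bins(labels, probs, n_bins=10):
    """Group predictions into equal-width probability bins and return
    (mean predicted probability, observed event rate, count) per non-empty bin."""
    sums = [[0.0, 0, 0] for _ in range(n_bins)]  # [sum of probs, events, count]
    for y, p in zip(labels, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        sums[idx][0] += p
        sums[idx][1] += y
        sums[idx][2] += 1
    return [(s / n, e / n, n) for s, e, n in sums if n]
```

Plotting the first element of each tuple against the second gives the calibration curve; a well calibrated model lies close to the diagonal.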
Discussion
Our experiments showed that CovRNN models trained on a large heterogeneous dataset of approximately 200 000 patients with COVID-19 required minimal data curation to achieve high prediction accuracy (AUROC 86·0–97·0%) for different patient clinical outcomes, namely in-hospital mortality, mechanical ventilation, and prolonged hospital stay. CovRNN not only showed high prediction accuracy but also good transferability between two large deidentified electronic health record databases with different structures, good external validity, proper model calibration, and the utility of fine-tuning for continuous improvement. In addition, we used integrated gradients to expose the factors that contribute to the model-predicted scores.
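The idea behind integrated gradients can be shown on a much simpler model than CovRNN. The sketch below is an illustration under stated assumptions, not the explanation utility of the framework: it uses a toy logistic model with an analytic gradient, a zero baseline chosen by the caller, and a midpoint Riemann sum; all names and weights are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, weights, bias):
    """Toy stand-in model: logistic regression over a feature vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def integrated_gradients(x, baseline, weights, bias, steps=200):
    """Approximate IG_i = (x_i - b_i) * integral over a in [0, 1] of
    dF/dx_i evaluated at b + a * (x - b), using a midpoint Riemann sum."""
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        p = predict(point, weights, bias)
        for i, w in enumerate(weights):
            totals[i] += p * (1.0 - p) * w  # analytic gradient of the sigmoid output
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]
```

A useful sanity check is the completeness property: the attributions should sum to the difference between the prediction for the input and the prediction for the baseline.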
CovRNN models consistently outperformed other methods (logistic regression and light gradient boost machine). Interestingly, we found that the maximum difference between the AUROC estimates made by logistic regression, light gradient boost machine, and CovRNN models was around 3% for in-hospital mortality and mechanical ventilation, whereas the difference exceeded 6% for the prediction of prolonged hospital stay. Similarly, we observed that the accuracy of predicting a prolonged hospital stay was highly affected by the inclusion of full patient history versus information from the last (index) visit only. Therefore, we believe that considering the sequence of events that occurred in the past is of higher importance for the prolonged hospital stay prediction task than for the in-hospital mortality and mechanical ventilation prediction tasks, for which we infer that the most recent events are of higher importance.
Compared with previously reported models (Schwab et al; Bennett et al; Villegas et al; Razavian et al; Yadaw et al), our models were trained and evaluated on larger, multicentre cohorts from two large, well known, deidentified electronic health record databases from the USA (a total of 284 100 patients). CovRNN outperforms other prediction models for COVID-19 outcomes that have been trained and evaluated on more than 50 000 patients with COVID-19 and commonly rely on boosting-based algorithms (He, Page, Weinberg, and Mishra; Feng, Proceedings of the 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, 2021, abstr 61). The N3C study (Bennett et al) included a similar number of patients with COVID-19 (160 000) in their training set; however, their reported prediction accuracy (AUROC) for in-hospital mortality and mechanical ventilation (combined as a severity indicator) was 87% (95% CI 86–88). In addition, the majority of published studies with machine learning models predict outcomes in a very short follow-up window, such as 1 h or 1 day from the index timepoint (Schwab et al; Yadaw et al). Furthermore, some studies did not specify the time window of prediction or used limited historical data (Estiri, Strasser, and Murphy). As window periods become shorter, the prediction task becomes easier, and, thus, accuracy increases; nonetheless, the results are less helpful as physicians can predict short-term clinical outcomes better without using models. We reported the results of our CovRNN survival models to show the flexibility of our approach. We believe, however, that predicting the probability of adverse events occurring within the hospital stay should be informative enough for clinicians to make appropriate decisions on admission and will not be limited to a specific time range. Therefore, we also focused on the evaluation and calibration of the binary classification models.
Healthcare workers. Interim clinical guidance for management of patients with confirmed coronavirus disease (COVID-19).
This characteristic clinical course in patients with COVID-19 makes it somewhat difficult for clinicians to predict future outcomes on the first day of hospital encounters. Our models are particularly helpful in these clinical scenarios because they predicted the occurrence of in-hospital mortality with a specificity of 70·93% at 95% sensitivity. The threshold can be easily adjusted to prioritise sensitivity or specificity to meet clinicians’ needs. For example, in a scenario in which our models predict the patient’s death with high specificity, physicians could initiate an early discussion of poor outcomes with the patient and goals of care in appropriate circumstances. As the risk of further COVID-19 surges still cannot be ruled out and scenarios of health-care systems being overwhelmed with patients are still a distinct possibility, CovRNN can be a useful tool while triaging patients. The score provided by CovRNN can be used to risk-stratify large numbers of patients on the basis of their readily available data in a few seconds. The minimal need for data curation and reliance on the power of the deep learning model architecture for learning proper feature representations from large data are key advantages of our CovRNN models. We were able to transfer the models between two datasets that differ in several ways, such as in the distribution of clinical codes. With a simple model fine-tuning step on sample data from the destination dataset, the models consistently achieved high prediction accuracy. Although we focused on the outcomes of patients with COVID-19, this study is proof of concept that we could apply the same methodology to predict different clinical conditions.
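Choosing an operating point such as "specificity at 95% sensitivity" amounts to picking a score threshold. The sketch below shows one way to do this, assuming a patient is flagged when the score is at or above the threshold; the function name and the toy data in the usage note are hypothetical and unrelated to the study cohorts.

```python
import math


def specificity_at_sensitivity(labels, scores, target_sensitivity=0.95):
    """Return (threshold, sensitivity, specificity) for the highest threshold
    whose sensitivity is at least the target; flagged means score >= threshold."""
    pos = sorted((s for y, s in zip(labels, scores) if y == 1), reverse=True)
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Number of positive cases that must be flagged; the small epsilon guards
    # against floating-point overshoot in the product.
    needed = max(1, math.ceil(target_sensitivity * len(pos) - 1e-9))
    threshold = pos[needed - 1]
    sensitivity = sum(s >= threshold for s in pos) / len(pos)
    specificity = sum(s < threshold for s in neg) / len(neg)
    return threshold, sensitivity, specificity
```

For example, with labels `[1, 1, 1, 1, 0, 0, 0, 0]`, scores `[0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]`, and a target sensitivity of 0.75, the function selects a threshold of 0.6. Raising the target sensitivity lowers the threshold and typically reduces specificity, which is the trade-off described above.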
Our study has several limitations. First, our data analysis included only retrospective data. Despite our efforts to avoid potential bias by separating training, validation, and test datasets and conducting external validation on a different data source, potential biases are inevitable. A prospective validation study is warranted, ideally in hospitals that did not participate in data sharing with the database that we used, to secure the validation of transferability. Second, our models focused only on predicting clinical outcomes at the time of hospital admission. It is possible to use multiple timepoints during the hospital stay to update models to achieve real-time predictions. Because minimal data preprocessing is required, our models can be easily modified to use different datapoints to predict future clinical outcomes. Third, real-world structured data from electronic health records are not always associated with standard codes. For example, data from Cerner Millennium might not be codified at all in the source system or can only be associated with clients’ proprietary event codes. Hospitals commonly have access to utilities to map their structured electronic health record data into industry-standard codes, which we used in our models, to facilitate interoperability, Fast Healthcare Interoperability Resources (known as FHIR) queries, data sharing, billing, and public health reporting tasks. Such utilities are commonly provided natively by their electronic health record vendors. Such mappings are required to benefit from our pretrained CovRNN models; otherwise, we recommend using our CovRNN training framework to train compatible models utilising the system’s proprietary event codes. However, these codes or representations are only meaningful in the context of the originating system, and they are not helpful for training transferable models.
COVID-19 vaccine breakthrough infections reported to CDC—United States, January 1–April 30, 2021.
Because our models are trained on historical data, they can be easily fine-tuned on more current data to improve prediction accuracy, which is one of the major advantages of deep learning models. Future work is warranted to fine-tune and evaluate our models on data from later pandemic waves.
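The fine-tuning step can be illustrated with a deliberately small stand-in. The sketch below is not the CovRNN fine-tuning utility: it uses a one-feature logistic model, full-batch gradient descent, and two invented toy cohorts, all of which are assumptions made only to show the pattern of continuing training from pretrained parameters on data from a new source.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(data, w, b):
    """Mean negative log-likelihood of a one-feature logistic model."""
    total = 0.0
    for x, y in data:
        p = min(max(sigmoid(w * x + b), 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)


def gradient_descent(data, w, b, lr=0.5, epochs=200):
    """Full-batch gradient descent starting from the given (w, b)."""
    for _ in range(epochs):
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in data) / len(data)
        grad_b = sum(sigmoid(w * x + b) - y for x, y in data) / len(data)
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b


# Invented "source" cohort: the outcome becomes likely above x = 0.5.
source = [(0.0, 0), (0.2, 0), (0.4, 0), (0.6, 1), (0.8, 1), (1.0, 1)]
# Invented "destination" cohort: same feature, but the outcome is more common.
destination = [(0.0, 0), (0.2, 1), (0.4, 1), (0.6, 1), (0.8, 1), (1.0, 1)]

w0, b0 = gradient_descent(source, 0.0, 0.0)                  # pretraining
w1, b1 = gradient_descent(destination, w0, b0, epochs=20)    # brief fine-tuning

loss_before = log_loss(destination, w0, b0)
loss_after = log_loss(destination, w1, b1)
```

In this toy setting a few fine-tuning epochs on destination data lower the destination loss relative to the pretrained parameters, which mirrors the before and after comparison reported for the OPTUM test set.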
Through benchmarking, we found that CovRNN can provide accurate and transferable predictive models for a range of outcomes and that we can continuously improve upon the models through periodic fine-tuning. Furthermore, our data preparation pipeline was kept to a minimum to facilitate the transferability of the models and further validation on new data sources. Our model development framework can be further applied to train and evaluate predictive models for different types of clinical events. For clinicians who are fighting COVID-19 on the frontlines, there are two potentially actionable contributions of our work. Clinicians can (1) fine-tune our pretrained models on their local data, regardless of cohort size, establish utility, and then deploy the models and (2) use our comprehensive model development framework to train a predictive model using their own data.
In conclusion, to the best of our knowledge, CovRNN models are the first COVID-19 outcome prediction models that can simultaneously accurately predict different outcomes on admission for patients with COVID-19 and use readily available structured data from electronic health records in their categorical format without the need for specific feature selection or missing value imputation. We also showed the value added by the fine-tuning utility of CovRNN and how it can be used to improve models’ prediction accuracy. Such utility can be further used to continuously improve the models, as per Good Machine Learning Practice recommendations, to secure the models’ reliability and sustainability.
LR and DZ conceived the idea for this study. LR led the design and implementation of experiments. LR, KP, MN, and BSK reviewed the evidence before the study. MN contributed to the discussion and the model explanation evaluation. ZX contributed to the model explanation. LR, BM, and KP ran the experiments on the OPTUM data. LR and YZ extracted the electronic health records data. WZ added the visualisations. LR led the manuscript writing. BSK, MN, HX, and DZ contributed to the writing. AR assessed the study against TRIPOD and PROBAST standards. HX and DZ supervised the project. LR and DZ finalised the manuscript. LR, BSK, and MN accessed and verified the CRWD data. LR, YZ, BM, and KP accessed and verified the OPTUM data. All coauthors reviewed and approved the manuscript. All authors had full access to all the data in the study and had final responsibility for the decision to submit for publication.