If you were selected to work on a COVID research team, how would you build your prediction model?

1. Identify the dependent variable

  • Clinical severity
    • Operationalization:
      • Mild
      • Moderate (hospitalized but not on ventilators)
      • Severe (hospitalized and on ventilators)
      • Deceased (from COVID-19)
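
The operationalization above can be written as an explicit coding rule. This is a minimal sketch: the names `Severity` and `classify_severity` are hypothetical, and it assumes that "mild" covers every confirmed case that was neither hospitalized nor fatal, which the outline does not state.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Ordered outcome categories; a higher value means a more severe outcome."""
    MILD = 0
    MODERATE = 1
    SEVERE = 2
    DECEASED = 3


def classify_severity(hospitalized, on_ventilator, died_of_covid):
    """Map three yes/no clinical flags to one severity category."""
    if died_of_covid:
        return Severity.DECEASED
    if hospitalized and on_ventilator:
        return Severity.SEVERE
    if hospitalized:
        return Severity.MODERATE
    # Assumption: all remaining confirmed cases count as mild.
    return Severity.MILD


print(classify_severity(hospitalized=True, on_ventilator=False, died_of_covid=False).name)
```

Storing the categories as ordered integers keeps both a multinomial and an ordinal model available later.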

2. List all the possible predictors

  • Age
  • Gender (female/male)
  • Race (African American/Asian/White/Other)
  • Ethnicity (Hispanic/Non-Hispanic)
  • Obesity (Body mass index)
  • Preexisting health conditions (e.g. diabetes)
  • Insurance (has insurance/does not have insurance)

3. Work up a strategy and plan for your prediction model

  • Use the UT Dallas Proactive Testing database
  • Restrict the sample to individuals with a laboratory-confirmed positive COVID-19 test, then randomly split them into a training set (70%) and a testing set (30%)
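
A random 70/30 split could look like the sketch below. The function name `split_train_test` and the integer IDs standing in for positive cases are illustrative assumptions, not part of the database. A fixed seed makes the split reproducible.

```python
import random


def split_train_test(records, train_fraction=0.7, seed=42):
    """Randomly partition records into a training list and a testing list."""
    rng = random.Random(seed)      # fixed seed so the split can be reproduced
    shuffled = list(records)       # copy, so the caller's list is not reordered
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical stand-in for the confirmed-positive cases: 1,000 patient IDs.
positive_cases = list(range(1000))
train, test = split_train_test(positive_cases)
print(len(train), len(test))
```

Because outcomes such as death are likely to be rare, a split stratified by severity category may be preferable so that both sets contain every category.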

3a. Estimation

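  • Model the four-category DV with multinomial logistic regression (or ordinal logistic regression, since the severity levels are ordered) and test which predictors distinguish the groups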
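
To make the estimation step concrete, here is a minimal sketch of multinomial (softmax) logistic regression fitted by gradient descent. It assumes NumPy is available and uses synthetic data: the two standardized predictors (`age_z`, `bmi_z`), the cut points, and all function names are invented for illustration. A real analysis would more likely use a statistics package that also reports standard errors.

```python
import numpy as np


def softmax(scores):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)


def add_intercept(X):
    """Prepend a column of ones so the model has an intercept per class."""
    return np.column_stack([np.ones(len(X)), X])


def fit_multinomial_logit(X, y, n_classes, learning_rate=0.5, n_iter=2000):
    """Minimize the mean cross-entropy loss with batch gradient descent."""
    Xb = add_intercept(X)
    Y = np.eye(n_classes)[y]                    # one-hot encoded outcome
    W = np.zeros((Xb.shape[1], n_classes))      # rows: intercept + predictors
    for _ in range(n_iter):
        P = softmax(Xb @ W)
        W -= learning_rate * (Xb.T @ (P - Y)) / len(Xb)
    return W


def predict_proba(W, X):
    """Predicted probability of each outcome category for every row of X."""
    return softmax(add_intercept(X) @ W)


# Synthetic data (assumption): severity rises with age and BMI.
rng = np.random.default_rng(0)
n = 400
age_z = rng.normal(size=n)
bmi_z = rng.normal(size=n)
latent = 1.5 * age_z + 1.0 * bmi_z + rng.normal(scale=0.5, size=n)
severity = np.digitize(latent, [-1.5, 0.0, 1.5])  # codes 0..3, mild to deceased

X = np.column_stack([age_z, bmi_z])
W = fit_multinomial_logit(X, severity, n_classes=4)
probs = predict_proba(W, X)
accuracy = (probs.argmax(axis=1) == severity).mean()
print(f"in-sample accuracy: {accuracy:.2f}")
```

The sketch reports in-sample accuracy only to stay short. Under the plan above, the model would be fitted on the training set and evaluated on the testing set. The coefficients here are also not expressed relative to a reference category, as most statistical packages report them.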

3b. Resampling methods

  • Use bootstrap resampling: draw many samples with replacement from the dataset, re-estimate the model on each one, and use the spread of the resulting estimates to gauge their precision (standard errors and confidence intervals)
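
The bootstrap idea can be sketched with a simple statistic in place of the full regression. The example below is a minimal illustration: the function name `bootstrap_ci`, the percentile interval, and the made-up 0/1 indicator data are all assumptions. It resamples with replacement and reports a 95% interval for the proportion of severe cases.

```python
import random
import statistics


def bootstrap_ci(data, statistic, n_resamples=2000, alpha=0.05, seed=0):
    """Return the point estimate and a percentile bootstrap confidence interval."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(data, k=n)   # sample n items with replacement
        estimates.append(statistic(resample))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return statistic(data), lower, upper


# Made-up indicator data (assumption): 1 = severe case, 0 = otherwise.
is_severe = [1] * 60 + [0] * 140
point, lower, upper = bootstrap_ci(is_severe, statistics.mean)
print(f"proportion severe: {point:.2f}, 95% CI: [{lower:.2f}, {upper:.2f}]")
```

For the prediction model itself, `statistic` would refit the regression on each resample and return a coefficient, so the same loop yields an interval for each coefficient.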