If you were selected to work on a COVID research team, how would you build your prediction model?
1. Identify the dependent variable
- Clinical severity
- Operationalization (four levels):
  - Mild
  - Moderate (hospitalized but not on ventilators)
  - Severe (hospitalized and on ventilators)
  - Deceased (from COVID-19)
2. List all the possible predictors
- Age
- Gender (female/male)
- Race (African American/Asian/White/Other)
- Ethnicity (Hispanic/Non-Hispanic)
- Obesity (Body mass index)
- Preexisting health conditions (e.g. diabetes)
- Insurance (has insurance/does not have insurance)
3. Work up a strategy and plan for your prediction model
- Use the UT Dallas Proactive Testing database
- Randomly split the individuals with a laboratory-confirmed positive COVID-19 test into a 70% training set and a 30% testing set
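The 70/30 split described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `train_test_split_ids`, the seed, and the placeholder IDs are assumptions made for the example and do not come from the UT Dallas database.

```python
import random

def train_test_split_ids(ids, train_fraction=0.7, seed=42):
    """Shuffle the IDs reproducibly, then cut them into training and testing sets."""
    shuffled = list(ids)                    # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # fixed seed makes the split repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical placeholder IDs standing in for confirmed-positive individuals.
positive_ids = list(range(1, 201))
train_ids, test_ids = train_test_split_ids(positive_ids)
print(len(train_ids), len(test_ids))
```

Splitting on individual IDs rather than on test records keeps every person in exactly one of the two sets, which matters if an individual was tested more than once.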
3a. Estimation
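- Test differences among the four DV groups by using multinomial logistic regression with all predictors entered together (ordinal logistic regression is an alternative, since the severity levels are ordered)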
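As a rough sketch of the estimation step, the block below fits a logistic regression with one set of coefficients per severity level (a softmax model) by plain gradient descent. Everything in it is an assumption made for illustration: the data are simulated, only two predictors (a standardized age and an obesity flag) are used, the outcome is treated as unordered, and the helper names are hypothetical.

```python
import math
import random

SEVERITY = ["mild", "moderate", "severe", "deceased"]  # DV levels from step 1

def softmax(scores):
    """Convert one score per class into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(weights, x):
    """weights[k] holds [intercept, coef_1, ..., coef_p] for class k."""
    scores = [w[0] + sum(c * v for c, v in zip(w[1:], x)) for w in weights]
    return softmax(scores)

def mean_log_loss(weights, X, y):
    """Average negative log-likelihood of the observed classes."""
    total = sum(math.log(predict_proba(weights, x)[label]) for x, label in zip(X, y))
    return -total / len(X)

def fit_multinomial(X, y, n_classes, lr=0.1, epochs=300):
    """Batch gradient descent on the multinomial log-loss."""
    n_features = len(X[0])
    weights = [[0.0] * (n_features + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        grads = [[0.0] * (n_features + 1) for _ in range(n_classes)]
        for x, label in zip(X, y):
            probs = predict_proba(weights, x)
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == label else 0.0)
                grads[k][0] += err
                for j, v in enumerate(x):
                    grads[k][j + 1] += err * v
        for k in range(n_classes):
            for j in range(n_features + 1):
                weights[k][j] -= lr * grads[k][j] / len(X)
    return weights

# Simulated (not real) data: severity tends to rise with age and obesity.
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    age_z = rng.gauss(0, 1)                        # standardized age
    obese = 1.0 if rng.random() < 0.3 else 0.0     # obesity indicator
    latent = age_z + 0.8 * obese + rng.gauss(0, 1)
    label = 0 if latent < 0 else 1 if latent < 1 else 2 if latent < 2 else 3
    X.append([age_z, obese])
    y.append(label)

weights = fit_multinomial(X, y, n_classes=len(SEVERITY))
baseline = math.log(len(SEVERITY))   # log-loss of guessing all four levels equally
fitted = mean_log_loss(weights, X, y)
print(f"log-loss: baseline {baseline:.3f}, fitted {fitted:.3f}")
```

In a real analysis, a statistics library would normally replace the hand-written fit, since it also reports standard errors and significance tests for each predictor. The fitted model would then be evaluated on the 30% testing set.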
3b. Resampling methods
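- Use bootstrap resampling: repeatedly draw samples with replacement from the dataset and re-estimate the model on each sample. The spread of the estimates across resamples gives standard errors and confidence intervals, and the number of resamples can be increased until the estimates are stable.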
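The bootstrap idea can be sketched with a percentile confidence interval. The example below is a simplified illustration, not the full procedure: it resamples a single statistic (the proportion of severe-or-worse outcomes) instead of refitting the regression, and the outcome vector, the function name `bootstrap_ci`, and the settings are hypothetical.

```python
import random

def bootstrap_ci(data, statistic, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for statistic(data)."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(data, k=n)   # draw n observations with replacement
        estimates.append(statistic(sample))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

def proportion(values):
    """Share of 1s in a list of 0/1 indicators."""
    return sum(values) / len(values)

# Hypothetical outcomes: 1 = severe or deceased, 0 = mild or moderate.
outcomes = [1] * 30 + [0] * 170
point_estimate = proportion(outcomes)
lower, upper = bootstrap_ci(outcomes, proportion)
print(f"estimate {point_estimate:.3f}, 95% CI ({lower:.3f}, {upper:.3f})")
```

To bootstrap the regression itself, `statistic` would refit the model on each resample and return the coefficient of interest.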