Project #1 – STAT 4660 Fall 2022
Due Friday March 4 by midnight
In this project you will conduct a simulation study of different COVID mitigation strategies and use it to make a case for what you think the best strategy would have been. You will simulate a town of approximately 10,000 people for several years (or however long it takes for the epidemic to run its course). In your town, people should live in households with other people and go to work or school. This will allow you to model disease transmission dynamics in a more realistic way than the previous models that we have studied. Your town should have a moderately realistic structure in terms of household composition and the number and size of businesses. You will introduce the disease into the town in a small number of individuals and then see how it spreads. You should test at least the following scenarios:
1) No mitigation strategies are attempted.
2) Total lockdown with no one leaving home.
3) Lockdowns with e.g. 90%, 80%, …50%,… of people staying home.
4) Measures such as masking and social distancing. You can model these simply as reductions in transmission probabilities.
5) School closing only. Businesses stay open.
6) Businesses only closing. Schools stay open.
7) You should consider these scenarios with and without a vaccine.
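Scenarios 3 and 4 can both be expressed as simple adjustments to the model inputs. The following is a minimal Python sketch of one way to do that, not a required design. The function names `effective_transmission_prob` and `choose_stay_home` are hypothetical, and the numeric values are placeholders chosen only for illustration; they are not based on data.

```python
import random

def effective_transmission_prob(base_prob, reduction):
    """Scale a per-contact transmission probability by a mitigation factor.

    reduction is the fractional reduction from masking or distancing
    (0 = no measure, 1 = transmission fully blocked).
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return base_prob * (1.0 - reduction)

def choose_stay_home(n_people, frac_home, rng):
    """Return a list of booleans marking who stays home in a partial lockdown.

    Each person stays home independently with probability frac_home.
    """
    return [rng.random() < frac_home for _ in range(n_people)]

rng = random.Random(1)                            # fixed seed for reproducibility
p_work = effective_transmission_prob(0.02, 0.5)   # placeholder values, not real data
home = choose_stay_home(10_000, 0.8, rng)         # roughly 80% of people stay home
```

Keeping mitigation in small functions like these means the same simulation code can be rerun for each scenario by changing only the arguments.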
You should consider metrics such as total deaths, total cases, number of people with long COVID, etc. It might make sense to consider not just deaths, but years of life lost (i.e. a child dying represents many more years of life lost than an 80-year-old dying, and this is relevant). The various parameters in your model should be based on real data. This includes:
- Household composition (i.e. number of households with one person, two people, two people with kids, how many kids, etc).
- Infection rates in different settings: home, school, work.
- Mortality rates by age.
- Length of time that someone is infectious.
- Probability of long COVID.
You should write a report that explains what you did, your results with figures/tables, and your conclusions about the best COVID strategy. In order to get a good grade, you must do the following:
- Clearly explain your assumptions (i.e. the rules of how your simulation works).
- Make a reasonable effort to have realistic assumptions.
- Have a thorough discussion of caveats and shortcomings of your model and results.
- Have your code match your stated assumptions.
- Have your model parameters based on data and reference the source of that data.
- Have code that is well organized and documented.
- Have appropriate figures and tables.
- Make a good case for your preferred mitigation strategy based on model output. We are modeling the disease side of things in detail, but not modeling things like economic impacts, lost education opportunities, mental health impacts, etc. Thus, you can make qualitative arguments about these things.
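The years-of-life-lost metric mentioned among the metrics above can be computed from the ages at death that the simulation records. The following is a minimal Python sketch. The function name `years_of_life_lost` is hypothetical, and the single reference life expectancy of 80 years is a placeholder assumption; a more careful analysis would look up remaining life expectancy by age in a published life table.

```python
def years_of_life_lost(ages_at_death, life_expectancy=80.0):
    """Sum the years remaining up to a fixed reference life expectancy.

    Deaths at or above the reference age contribute zero.
    """
    return sum(max(life_expectancy - age, 0.0) for age in ages_at_death)

deaths = [8, 45, 82]                    # made-up ages at death, for illustration
total_yll = years_of_life_lost(deaths)  # 72 + 35 + 0 = 107 years
```

With this metric, two scenarios with the same number of deaths can rank differently if one of them shifts deaths toward younger people.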
Hints:
1. Your program should follow the same basic form as the IBM model in the lecture notes. That is, there should be a matrix or data frame that tracks the state of the population, with rows corresponding to members of the population and columns corresponding to variables that you want to track. There should be a loop over time, and in each time step various functions are applied to the population that implement processes such as infection, recovery, etc.
2. My population data frame has columns of disease status, age, age at infection, sex, household index, workplace/school, employee index/school grade, day of recovery if infected, day of death if they die from disease, indicator long COVID. My program does not actually use age at infection or sex, but I kept these from the previous program in case I want them in the future. You do not need to have exactly the same columns. I am telling you this to give an example of what I did.
3. You should make a function to construct your population. You will need to do this many times and you will want to experiment with different population compositions. Having a function will make this more convenient. My function is of form make_households(F0,nw,sb,lb,nsb,nlb), where F0 is the number of households, nw is the number of people who don’t work, sb is the number of small businesses with nsb employees, and lb is the number of large businesses with nlb employees. The function loops over desired households. For each one, it randomly determines the household composition (single individual, couple, couple with kids, etc) and the number of children (if there are any) based on real data for the US. I have another function that takes the values nw,sb,lb,nsb, and nlb and constructs the available jobs. It then randomly assigns adults to jobs. It does similarly with schools for children. It calculates grade in school based on age.
4. Households, jobs, and schools only matter in the model because they determine which members of the population interact with each other and therefore can infect each other. Rates of infection may vary between different settings.
5. I have separate functions for disease transmission in households, businesses, schools, and in the community. I also have functions that implement mitigation measures such as lockdowns. Putting things in functions allows you to easily change them in the future. For example, if I decide that I want to have household disease transmission work in a different way, it just means writing a new function. Provided that the input and output is the same, I don’t need to change anything else in the program.
6. There are many different pieces to this program, but work on one thing at a time. Keep testing as you go along. Check at every step to make sure that your code is doing what it is supposed to.
7. The first thing that you probably should do is make the function that constructs your population because you can’t test anything else until you have a correct population data frame.
8. You will get most of the data for model parameters from scientific papers. One way to find these is using Google Scholar. Search “scholar” on Google and it will give you Google Scholar. E.g. searching “us household composition” led me to the data on which I based my make_households function.
9. One advantage of using functions is that you can start with parameter values that you make up for testing and then easily plug in better values later after doing research.
10. There is partial credit! Your program does not have to be perfect to get points.
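The overall program shape described in hints 1, 3, and 5 (a population table, a constructor function, and a time loop that applies one function per process) can be sketched as follows. This is a toy illustration under stated assumptions, not the instructor's program: all function names (`make_population`, `infect_households`, `recover`, `run_simulation`) are hypothetical, every parameter value is made up for testing, household sizes are drawn uniformly rather than from real data, and a list of dictionaries stands in for the data frame so that only the Python standard library is needed. Only household transmission is included; work, school, and community transmission would be further functions called inside the same loop.

```python
import random

def make_population(n_households, rng):
    """Build a toy population: one dict per person (a stand-in for a data frame row)."""
    pop = []
    for h in range(n_households):
        size = rng.choice([1, 2, 3, 4])  # placeholder; replace with real household data
        for _ in range(size):
            pop.append({"status": "S", "age": rng.randint(0, 90),
                        "household": h, "day_recover": None})
    return pop

def infect_households(pop, p_home, day, infectious_days, rng):
    """Infect susceptibles with probability p_home per infectious housemate per day."""
    # Count infectious people per household before changing anyone's status,
    # so people infected in this step do not transmit in the same step.
    infectious = {}
    for person in pop:
        if person["status"] == "I":
            h = person["household"]
            infectious[h] = infectious.get(h, 0) + 1
    for person in pop:
        k = infectious.get(person["household"], 0)
        if person["status"] == "S" and k > 0:
            if rng.random() < 1.0 - (1.0 - p_home) ** k:
                person["status"] = "I"
                person["day_recover"] = day + infectious_days
    return pop

def recover(pop, day):
    """Move infectious people to recovered once their recovery day arrives."""
    for person in pop:
        if person["status"] == "I" and person["day_recover"] <= day:
            person["status"] = "R"
    return pop

def run_simulation(n_households, n_days, n_seed, p_home, infectious_days, seed=0):
    """Seed a few infections, then apply each process function once per day."""
    rng = random.Random(seed)
    pop = make_population(n_households, rng)
    for person in rng.sample(pop, n_seed):
        person["status"] = "I"
        person["day_recover"] = infectious_days
    history = []  # number of infectious people at the end of each day
    for day in range(1, n_days + 1):
        pop = infect_households(pop, p_home, day, infectious_days, rng)
        pop = recover(pop, day)
        history.append(sum(p["status"] == "I" for p in pop))
    return pop, history

pop, history = run_simulation(n_households=500, n_days=60, n_seed=5,
                              p_home=0.1, infectious_days=7)
```

Because each process takes the population in and returns it, a different household transmission rule can be swapped in by writing a new function with the same inputs and output, which is the point made in hint 5.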