I've got some data on an epidemic in various locations: the total number of agents and the number killed by the infection after 1 year. This gives me a distribution of the percentage of each population that has been killed by the infection (all the percentage values are relatively small).
I wrote a mathematical ODE model for the disease spread within a population with 3 free parameters:
p1 - probability of getting infected externally from the environment
p2 - probability of infecting a new agent once at least one is already sick
p3 - probability that a replacement agent is already infected (once an agent dies, it is replaced with a new one)
Now I need to choose values for p1, p2 and p3 so that the model generates data whose distribution is as close as possible to the original distribution.
The trouble is that I have never done anything like this before and have very little experience with any sort of statistics.
How should I define the original data distribution: as a list of percentages of killed agents, or as a continuous function somehow?
Then should I choose values for p1, p2 and p3 by trial and error, and run the simulation multiple times for each choice to generate a distribution of simulated data?
Lastly, is there a proper way of comparing the obtained data with the original set? I've seen something somewhere about distance functions. What would be the best way of implementing this?
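To make the question concrete, here is a rough Python sketch of the procedure I have in mind. It is only a guess at a workflow, not something I know to be correct. The distance is the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical CDFs), which is just one possible choice. The function simulate_fraction_killed is a hypothetical toy stand-in for my real ODE model, and the per-step death probability, the parameter ranges and all function names are assumptions made up for illustration.

```python
import random
from bisect import bisect_right


def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    # The empirical CDFs only change at observed values, so checking those is enough.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def simulate_fraction_killed(p1, p2, p3, rng, n_agents=100, n_steps=50, p_death=0.05):
    """HYPOTHETICAL toy simulator standing in for the real model (not the actual ODEs).

    p_death (chance per step that an infected agent dies) is an assumption.
    Returns deaths divided by the population size for one simulated location.
    """
    infected = [False] * n_agents
    deaths = 0
    for _ in range(n_steps):
        # p2 only contributes once at least one agent is already sick.
        p_infect = min(1.0, p1 + (p2 if any(infected) else 0.0))
        for i in range(n_agents):
            if infected[i]:
                if rng.random() < p_death:
                    deaths += 1
                    # The dead agent is replaced; the newcomer is infected with probability p3.
                    infected[i] = rng.random() < p3
            elif rng.random() < p_infect:
                infected[i] = True
    return deaths / n_agents


def random_search(observed, n_trials=20, n_locations=20, seed=0):
    """Try random (p1, p2, p3) and keep the set whose simulated sample is closest."""
    rng = random.Random(seed)
    best_params, best_dist = None, float("inf")
    for _ in range(n_trials):
        # Assumed search ranges; they would need adjusting for the real model.
        params = (rng.uniform(0, 0.01), rng.uniform(0, 0.05), rng.uniform(0, 0.2))
        simulated = [simulate_fraction_killed(*params, rng) for _ in range(n_locations)]
        dist = ks_distance(observed, simulated)
        if dist < best_dist:
            best_params, best_dist = params, dist
    return best_params, best_dist
```

Here the original data would simply be the list of killed fractions, one per location, passed in as observed. random_search(observed) would then return the best (p1, p2, p3) it found together with its distance, where 0 means the two samples have identical empirical CDFs and 1 means they do not overlap at all. I assume a proper optimiser would do better than purely random trial and error, which is part of what I am asking.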
Thanks for any advice!