The choice of best-fit criterion really depends on what you want to
do, but if y(t_1), ..., y(t_n) are what the model predicts at times
t_1, ..., t_n, and o(t_1), ..., o(t_n) are the observed values, then the
model which minimises the sum of squared residuals:
SSR = (o(t_1)-y(t_1))^2 + (o(t_2)-y(t_2))^2 + ... + (o(t_n)-y(t_n))^2
would be considered the best (subject to a number of considerations about
how much noise there is in the measurement process, and how many
parameters you have to estimate in the model from the data).
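As a rough illustration of the sum of squared residuals above, here is a minimal Python sketch. The data and the two candidate models (model_a, model_b) are made up for the example; only the formula itself comes from the discussion.

```python
def ssr(observed, predicted):
    """Sum of squared residuals between observations and model predictions."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum((o - y) ** 2 for o, y in zip(observed, predicted))


# Made-up observation times and observed values o(t_i), for illustration only.
t = [0.0, 1.0, 2.0, 3.0]
o = [1.1, 2.9, 5.2, 6.8]


# Two hypothetical candidate models y(t).
def model_a(x):
    return 2.0 * x + 1.0


def model_b(x):
    return 2.0 * x + 2.0


ssr_a = ssr(o, [model_a(ti) for ti in t])
ssr_b = ssr(o, [model_b(ti) for ti in t])

# Under the plain least-squares criterion, the smaller SSR wins.
best = "A" if ssr_a < ssr_b else "B"
print(f"SSR(A) = {ssr_a:.2f}, SSR(B) = {ssr_b:.2f}, best = {best}")
```

This only ranks the candidates by raw SSR. It does not account for measurement noise or for the number of parameters each model had to estimate.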
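You should be aware that you can put a polynomial of degree at most (n-1)
exactly through any n data points of this type (provided the t_i are all
different), which makes SSR = 0. So perfection is possible but is overkill:
such a polynomial fits the noise as well as the trend, and will usually
result in a poor model.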
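To see why an exact polynomial fit tends to be a poor model, here is a small Python sketch. The data are made up (roughly o = 2t + 1 plus some noise), the extra observation at t = 5 is an assumed value, and the helpers lagrange and fit_line are names chosen for this example. It compares a degree-4 polynomial through all 5 points with an ordinary least-squares straight line.

```python
def ssr(observed, predicted):
    """Sum of squared residuals."""
    return sum((o - y) ** 2 for o, y in zip(observed, predicted))


def lagrange(ts, obs):
    """Polynomial of degree <= n-1 through the n points (ts[i], obs[i])."""
    def p(x):
        total = 0.0
        for i, (ti, oi) in enumerate(zip(ts, obs)):
            term = oi
            for j, tj in enumerate(ts):
                if j != i:
                    term *= (x - tj) / (ti - tj)
            total += term
        return total
    return p


def fit_line(ts, obs):
    """Least-squares straight line (two parameters: intercept and slope)."""
    n = len(ts)
    t_mean = sum(ts) / n
    o_mean = sum(obs) / n
    sxy = sum((t - t_mean) * (o - o_mean) for t, o in zip(ts, obs))
    sxx = sum((t - t_mean) ** 2 for t in ts)
    slope = sxy / sxx
    intercept = o_mean - slope * t_mean
    return lambda x: intercept + slope * x


# Made-up noisy data, for illustration only.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
o = [1.3, 2.6, 5.4, 6.7, 9.2]

poly = lagrange(t, o)
line = fit_line(t, o)

ssr_poly = ssr(o, [poly(ti) for ti in t])   # exact fit to the data
ssr_line = ssr(o, [line(ti) for ti in t])   # small but non-zero

# An assumed later observation that was not used in fitting.
t_new, o_new = 5.0, 11.0
err_poly = abs(poly(t_new) - o_new)
err_line = abs(line(t_new) - o_new)

print(f"SSR on fitted data: polynomial {ssr_poly:.3g}, line {ssr_line:.3g}")
print(f"Error at t = {t_new}: polynomial {err_poly:.3g}, line {err_line:.3g}")
```

The polynomial wins on SSR over the fitted points but does much worse at the new point, because it has used all of its parameters to chase the noise.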
RonL

