What kind of distribution model did you have in mind? Is it just a standard multinomial?
I'm using the maximum entropy principle to estimate the probability of observing a rare event. For example, if I roll a fair six-sided die, I know the probability of rolling any number is 1/6 and the expected outcome is 3.5. However, if I observed something rare, say , I want to use the large deviation/maximum entropy principle and gradient descent to determine a Gibbs representation of the empirical distribution (the frequency of each event).
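To make the setup concrete, here is a rough sketch of the kind of calculation I mean. It assumes the only constraint is the observed mean, so the maximum entropy solution has the Gibbs form p_i proportional to exp(lam * i), and it fits lam by plain gradient descent on the convex dual log Z(lam) - lam * m. The target mean of 4.5, the step size, the iteration count, and all function names are placeholders I made up for illustration, not values from my actual problem.

```python
import math

FACES = [1, 2, 3, 4, 5, 6]

def gibbs(lam):
    """Gibbs distribution p_i proportional to exp(lam * i) over the die faces."""
    weights = [math.exp(lam * x) for x in FACES]
    z = sum(weights)  # partition function Z(lam)
    return [w / z for w in weights]

def mean(p):
    """Expected face value under distribution p."""
    return sum(x * px for x, px in zip(FACES, p))

def fit_lambda(target_mean, lr=0.1, steps=5000):
    """Gradient descent on the dual f(lam) = log Z(lam) - lam * target_mean.

    The gradient is E_lam[X] - target_mean, so the minimiser is the lam
    whose Gibbs distribution reproduces the target mean.
    """
    lam = 0.0  # lam = 0 corresponds to the fair (uniform) die
    for _ in range(steps):
        grad = mean(gibbs(lam)) - target_mean
        lam -= lr * grad
    return lam

def kl_from_uniform(p):
    """Relative entropy of p with respect to the fair die."""
    return sum(px * math.log(6 * px) for px in p if px > 0)

target = 4.5  # hypothetical "rare" observed average, chosen only for illustration
lam = fit_lambda(target)
p = gibbs(lam)
rate = kl_from_uniform(p)

print("lambda:", lam)
print("tilted distribution:", p)
print("relative entropy vs. fair die:", rate)
```

If I understand the theory correctly, the relative entropy printed at the end is the large deviation rate, so the probability that the average of n rolls comes out at least as large as the target should decay roughly like exp(-n * rate).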
Would be grateful for any help.