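My teacher gave the following definition:
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)}$$
yet it seems to make more sense to first define
$$P(B \mid A) = \frac{n(A \cap B)}{n(A)}$$
where A takes on the role of the sample space, and then divide both the top and bottom of the fraction by n(S) to get:
$$P(B \mid A) = \frac{n(A \cap B)/n(S)}{n(A)/n(S)}$$
which, by the definition of P, yields the equation at the top.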
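As a quick sanity check, here is a minimal sketch that compares the two routes on one fair die roll. It assumes the definition in question is the usual conditional probability formula P(B|A) = P(A ∩ B)/P(A), that all outcomes in S are equally likely, and that n(A) > 0. The events A and B and the helper names `n` and `P` are made up for illustration.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
S = set(range(1, 7))              # sample space, all outcomes equally likely
A = {x for x in S if x % 2 == 0}  # assumed event A: the roll is even
B = {x for x in S if x > 3}       # assumed event B: the roll is greater than 3


def n(event):
    """Number of outcomes in an event."""
    return len(event)


def P(event):
    """Probability under equally likely outcomes: n(event) / n(S)."""
    return Fraction(n(event), n(S))


# Counting route: treat A as the new sample space.
by_counting = Fraction(n(A & B), n(A))

# Probability route: the ratio P(A and B) / P(A).
by_probability = P(A & B) / P(A)

print(by_counting, by_probability)
```

Both routes give the same fraction because dividing the numerator and denominator by n(S) does not change the ratio. Note that the counting version only makes sense when outcomes are equally likely, whereas the ratio of probabilities still works when they are not.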
Does this make sense? Also, can anyone recommend a good probability book that does this kind of derivation?