My teacher gave the definition that:

$\displaystyle P(B|A) = \frac{P(A\cap B)}{P(A)}$

yet it seems to make more sense to first define

$\displaystyle P(B|A) = \frac{n(A\cap B)}{n(A)}$, where $A$ takes on the role of the sample space, and then divide both the top and bottom of the fraction by $n(S)$ to get:

$\displaystyle P(B|A) = \frac{\frac{n(A\cap B)}{n(S)}}{\frac{n(A)}{n(S)}}$

which, by the definition $P(E) = n(E)/n(S)$ for a finite sample space $S$ with equally likely outcomes, yields the equation at the top.
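As a quick sanity check, here is a minimal sketch that compares the counting form with the teacher's definition. The fair six-sided die and the events `A` and `B` are made up for illustration, and the helper `prob` assumes equally likely outcomes.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die (equally likely outcomes).
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # the roll is even
B = {4, 5, 6}   # the roll is greater than 3

def prob(event, space):
    """Classical probability n(event) / n(space), valid only for equally likely outcomes."""
    return Fraction(len(event & space), len(space))

# Counting form: treat A as the reduced sample space.
by_counting = Fraction(len(A & B), len(A))

# Teacher's definition: P(A and B) / P(A).
by_definition = prob(A & B, S) / prob(A, S)

print(by_counting, by_definition)
```

Both values come out the same here, which is what the division by n(S) predicts. The counting form only applies when outcomes are equally likely, whereas the ratio of probabilities can be used whenever P(A) > 0.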

Does this make sense? Also, can anyone recommend a good probability book that does these kinds of derivations?