Hey Baillya.

One suggestion I have for you: when you have conditional probabilities or conditional moments, specify your conditioning variable in terms of something specific.

So instead of f(X|Z), you need a specific subset of Z, whether that is a specific value like Z = z or some custom subset like Z < 0 or (Z > 4 AND Z < 5) or something else. Without this information, it is essentially just an ordinary joint distribution and not really something that is conditional per se.

So for your f(Z): when Z is restricted to a known subset, the probability of that subset is something explicit (i.e. you can evaluate it to get a number between 0 and 1 inclusive). That number is what you use to get rid of the terms involving z, leaving a distribution for x constrained to that subset of z, which should be a pdf with only x in it.

To fix this, change the f(Z) to f(Z = z), where z stands for some subset of the whole space associated with the random variable Z. If this subset is a simple region, you represent its probability with an integral (or a summation in the discrete case), which can be evaluated to a number, and you then handle the other distribution f(X|Z = z) accordingly.
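As a rough sketch of what this looks like written out (the set name A and the subscripted densities f_{X,Z} and f_Z are my own notation, not from your post), assuming Z is continuous and the subset A has positive probability:

```latex
% Probability of the conditioning subset: a plain number in (0, 1]
P(Z \in A) = \int_A f_Z(z)\,dz

% Conditional pdf of X: integrate z out over A, then renormalize
f_{X \mid Z \in A}(x) = \frac{\int_A f_{X,Z}(x,z)\,dz}{P(Z \in A)}

% Special case of a single value Z = z, assuming f_Z(z) > 0
f_{X \mid Z}(x \mid z) = \frac{f_{X,Z}(x,z)}{f_Z(z)}
```

For a discrete Z the integrals become sums over the values in A. In the subset case the result has only x in it, because z has been integrated out over A.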

It is a subtle point but an important one: each distribution has some amount of variation, and the minute you go from many degrees of variation to fewer, the PDF has to reflect this. If it doesn't, you will probably screw up the rest of the analysis.

Every time you condition a random variable, you are looking at a specific subset of its outcomes given some condition, and this becomes a new distribution of its own. It's like drawing a Venn diagram with a big circle and a smaller circle inside it: the large circle is unconstrained (or less constrained), and the small circle is the constrained part representing your conditional distribution.
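To make the "small circle inside the big circle" picture concrete, here is a minimal Python sketch for a discrete case. The joint pmf values, the subset {1, 2}, and the helper name condition_on are all made up for illustration; they are not taken from your problem.

```python
from fractions import Fraction

# Hypothetical joint pmf P(X = x, Z = z), keyed by (x, z).
# The numbers are invented for illustration and sum to 1.
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10), (0, 2): Fraction(1, 10),
    (1, 0): Fraction(4, 10), (1, 1): Fraction(1, 10), (1, 2): Fraction(1, 10),
}

def condition_on(joint, subset):
    """Return (P(Z in subset), pmf of X given Z in subset)."""
    # Probability of the "small circle": a single number between 0 and 1.
    p_subset = sum(p for (x, z), p in joint.items() if z in subset)
    if p_subset == 0:
        raise ValueError("cannot condition on an event of probability zero")
    # Keep only outcomes inside the subset, sum z out, and renormalize.
    cond = {}
    for (x, z), p in joint.items():
        if z in subset:
            cond[x] = cond.get(x, 0) + p / p_subset
    return p_subset, cond

p_a, pmf_x = condition_on(joint, {1, 2})
print(p_a)
print(pmf_x)
```

With these made-up numbers, P(Z in {1, 2}) comes out to 1/2, and the conditional pmf of X is 3/5 at x = 0 and 2/5 at x = 1. It involves only x and sums to 1, so it is a distribution in its own right.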