What is the difference between using a joint probability density function to calculate the joint probabilities of observations/samples and simply multiplying along the branches of a probability tree to find the probabilities of certain events? The way I see it, a probability tree is just a visual representation of a discrete random variable that only takes the values 1 and 0, whereas the joint probability density function lets you work with other types of distributions. Is that right?

Well, if your distributions are continuous, you can toss the probability tree away, since it would need a branch for every possible value, and there are infinitely many of them. For continuous variables, the joint pdf is the way to go.
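
To make the contrast concrete, here is a rough Python sketch rather than anything definitive. The branch probabilities, the labels `A`/`B`/`x`/`y`, the choice of two independent standard normals, and the helper names (`joint_pmf`, `joint_pdf`, `prob_rectangle`) are all made up for illustration. In the discrete case, multiplying along the branches of the tree gives the joint probabilities directly. In the continuous case there are no branches to multiply, so the probability of an event comes from integrating the joint pdf over a region, here with a simple midpoint sum.

```python
import math

# Discrete case: a two-level probability tree (illustrative numbers only).
p_first = {"A": 0.3, "B": 0.7}
p_second_given_first = {
    "A": {"x": 0.5, "y": 0.5},
    "B": {"x": 0.2, "y": 0.8},
}

# Multiplying along each branch gives the joint probability of that path:
# P(first, second) = P(first) * P(second | first).
joint_pmf = {
    (f, s): p_first[f] * p_second_given_first[f][s]
    for f in p_first
    for s in p_second_given_first[f]
}


def normal_pdf(x):
    """Density of a standard normal at x."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def joint_pdf(x, y):
    """Joint density of two independent standard normals (an assumption)."""
    return normal_pdf(x) * normal_pdf(y)


def prob_rectangle(x_lo, x_hi, y_lo, y_hi, n=400):
    """Approximate P(x_lo <= X <= x_hi, y_lo <= Y <= y_hi) with a midpoint sum."""
    dx = (x_hi - x_lo) / n
    dy = (y_hi - y_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * dx
        for j in range(n):
            y = y_lo + (j + 0.5) * dy
            total += joint_pdf(x, y)
    return total * dx * dy


# Continuous case: the probability of an event is an integral of the joint pdf.
p_event = prob_rectangle(0.0, 1.0, 0.0, 1.0)

print(joint_pmf)
print(p_event)
```

The four entries of `joint_pmf` sum to 1, just like the leaves of the tree. `p_event` comes out at approximately 0.1165, which matches the square of P(0 ≤ X ≤ 1) for a standard normal, as it should for independent variables.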