If you could, please open the following link:

http://www.math.dartmouth.edu/~prob/prob/prob.pdf

My question is about Example 2.13 on page 70. I don't understand how they get from P(U <= sqrt{x}) to sqrt{x}.
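To pin down the step in question, here it is written out. This is a sketch that assumes, as I believe the example states, that U is chosen uniformly from [0, 1] and that X = U^2; the symbols F_X and a are only labels introduced here.

```latex
% Assumption: U is uniform on [0,1], X = U^2, and 0 <= x <= 1.
\[
F_X(x) = P(X \le x) = P(U^2 \le x) = P(U \le \sqrt{x}) = \sqrt{x}
\]
% The last equality is the step being asked about. For a uniform U on [0,1],
% P(U \le a) = a whenever 0 \le a \le 1, and here a = \sqrt{x}.
```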

For the same example, I don't understand why the graph of the probability density function f(x) looks so different from the graph of the distribution function F(x) (Figure 2.13). I thought that the cumulative distribution function was just the total area under the density function. So why don't we have only one graph (the density function) and say that the total area under the curve is the cumulative distribution function?

Could you also explain how I could derive the distribution function from some given information other than the density function? For example:

The experiment is to toss two balls into four boxes in such a way that each ball is equally likely to fall in any box. Let X denote the number of balls in the first box.

What is the cumulative distribution function of X?
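One way to make this setup concrete is to enumerate the sample space directly. The sketch below assumes the two balls are distinguishable and tossed independently, so there are 4 x 4 = 16 equally likely outcomes; the names `BOXES`, `BALLS`, `pmf`, and `cdf` are only illustrative.

```python
from fractions import Fraction
from itertools import product

BOXES = 4  # boxes are labelled 1..4; box 1 is "the first box"
BALLS = 2

# Assumption: balls are distinguishable and independent, so every
# (box of ball 1, box of ball 2) pair is equally likely.
outcomes = list(product(range(1, BOXES + 1), repeat=BALLS))

# Probability of each value of X = number of balls in box 1.
pmf = {k: Fraction(0) for k in range(BALLS + 1)}
for outcome in outcomes:
    k = outcome.count(1)
    pmf[k] += Fraction(1, len(outcomes))

def cdf(x):
    """F(x) = P(X <= x): add up the probabilities of all values k <= x."""
    return sum(p for k, p in pmf.items() if k <= x)

for k in range(BALLS + 1):
    print(f"P(X = {k}) = {pmf[k]},  F({k}) = {cdf(k)}")
```

Under these assumptions X takes only the values 0, 1, and 2, so the distribution function is a step function that jumps at those three points.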

Please help! I feel like I'm so close to understanding this!