I *do* see your problem.

What part (a) establishes is: if there is a function $f_x(y) \stackrel{\text{def}}{=} \text{sum}(x,y)$ such that:

P1) $f_x(1) = x'$

P2) $f_x(y') = (f_x(y))'$

There is only ONE such function. But is there such a function at all? It's not clear that there is (we know there "actually" is, but if we were doing this FOR THE FIRST TIME, we might not).

So we now focus on the SET:

$S = \{x \in \mathfrak{N}: \exists \text{ such a } f_x\}$

This set, for all we know, might be empty. A natural question is: is $1 \in S$?

Well, one function we KNOW exists, is the successor function. So let's try: $f_1 = s$, that is: $f_1(y) = y'$. Does this satisfy (P1) and (P2)?

(P1) states that we should have s(1) = 1', which is true (don't over-think this).

(P2) says that the following should be true:

$s(y') = (s(y))'$, which is also true since both sides are $s(s(y)) = (y')'$.

So we DEFINE $f_1(y) = y'$, which clearly exists, and since we can only have (at most) ONE such function, and there is INDEED one, that is "the one and only" (for $x = 1$, at least).

So $1 \in S$, which shows that $S$ is non-empty.

Well, now the baby's shoulders are out, and the rest comes pretty easily.

Suppose $x \in S$ (we can do this now, since at least the natural number 1 is in $S$). So now we can take the EXISTENCE of $f_x$ as given. What we want to do now, is show that $x'$ is likewise in $S$.

So, given $f_x$, how shall we come up with a candidate for $f_{x'}$? SInce the only functions we have to play with are $s$ (the successor function) and $f_x$ (which might be, for all we know, $f_1$ and only $f_1$), we do what comes naturally with two functions: we COMPOSE them.

In other words, our candidate for $f_{x'}$ will be:

$s\circ f_x$. We have to verify that this candidate is VIABLE, that it satisfies (P1) and (P2). Now (P1), in this scenario, says:

$(s \circ f_x)(1) = (x')'$

Unraveling this, we get: $(s \circ f_x)(1) = s(f_x(1)) = s(x')$. since BY ASSUMPTION $f_x$ satisfies (P1). Since by definition, $s(x') = (x')'$, we see (P1) does indeed hold.

Ok, now we tackle (P2):

$f_{x'}(y') = (s \circ f_x)(y') = s(f_x(y')) = s((f_x(y))')$ (since $f_x$ satisfies (P2))

$= ((f_x(y))')' = (s(f_x(y)))' = ((s \circ f_x)(y))' = (f_{x'}(y))'$, which establishes (P2) (I apologize for the horrendous amount of parentheses).

Thus $S = \mathfrak{N}$.

We haven't invoked commutativity, nor even shown it, at this point. We have only used properties of FUNCTIONS, and functional composition is not generally commutative.

But if you look closely, what (P2) actually says is that $f_x$ and $s$ commute:

$f_x(y') = (f_x \circ s)(y)$, while $(f_x(y))' = (s \circ f_x)(y)$.

And THAT is going to be the basis for commutativity in general.

*****************

As I have remarked before, the author of your text is "notationally suppressing" the functional nature of successorship, which can make the order in which he is evaluating expressions unclear. When he writes:

$x + y'$

the successor function is applied FIRST, the summing is done LATER. When he writes:

$(x + y)'$

it is the opposite, the summing is done first, the successor is taken after. Writing these explicitly as functions, as I have done, makes "which is inside what" more EXPLICIT, although the parentheses become a bother after a layer or two.

Note also, then when he says: "...for a FIXED $x$" (emphasis mine), you are supposed to see the "implied currying" of a function of two variables into two functions of a SINGLE variable:

$\text{sum}(x,y) = f_x(y)$.

I suspect the author is trying to "keep the material familiar", and in so doing, he obscures some of the underlying logic. The "intermediate function":

$x \to f_x$ is "hidden from view".

Logically, the two are equivalent (but not "equal"), as there is a one-to-one correspondence between the two ways of expressing this.

I hope that by writing $f_1$, you see that we are accomplishing the same thing as SHOWING that:

$s(x) = x+1 = 1+x$ ALL refer to the same UNIQUELY DEFINED function of $x$.

A side note:

Another approach to this same thing is to use "double induction" by showing x+y is defined uniquely for a GIVEN x and inductively on y, and then inductively on x. This is rather like "nesting" in computer programming.