"canonical" in this instance means "standard" (and here it is actually defined by a universal property).

meaning: we do it in "the same way" no matter which particular group and normal subgroup we have.

ok, let's look at it this way:

suppose i have a group G, and a homomorphism f from G to some other group G'. suppose that H is a subgroup of G, and i say "f kills H". what might i mean by that?

what i mean is: we have an obvious homomorphism of H into G defined by:

i_{H}(h) = h, for all h in H. and when i say f kills H, what i mean is:

f∘i_{H} is the 0-homomorphism (that is: it maps everything to the identity e' of G'...the term "0" is somewhat misleading, we mean it's a "trivial" homomorphism).

now if N is a normal subgroup of G, we have a surjective homomorphism p:G-->G/N given by p(g) = gN (send every element of G to the coset of N it lives in).
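to make p concrete, here is a small python sketch. the choice of group is my own toy example (not something from the discussion above): G = Z_12 written additively, and N = {0, 6}. the names p and add_cosets are just illustrative. it builds the cosets and checks that p respects the group operation.

```python
# toy example (an assumption for illustration): G = Z_12 under addition, N = {0, 6}
n = 12
G = range(n)
N = frozenset({0, 6})

def p(g):
    """the canonical map: send g to the coset g + N."""
    return frozenset((g + x) % n for x in N)

def add_cosets(A, B):
    """add two cosets by adding their elements pairwise."""
    return frozenset((a + b) % n for a in A for b in B)

cosets = {p(g) for g in G}
print(len(cosets))    # 6 cosets, each of size 2
print(sorted(p(1)))   # [1, 7]

# p is a homomorphism: p(g + h) = p(g) + p(h) for all g, h
p_is_hom = all(p((g + h) % n) == add_cosets(p(g), p(h)) for g in G for h in G)
print(p_is_hom)       # True
```

note that p(0) is N itself, which is the identity of G/N. that is exactly what "p kills N" means here.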

p is said to be "universal among homomorphisms that kill N". well, what does THAT mean?

it means if we have a homomorphism f:G-->G', such that f(n) = e' for all n in N, then f "factors through p", that is, there is some OTHER homomorphism f' with f = f'∘p (and since p is surjective, there is only one such f').

so let's unravel this. if f kills N, what we are saying is that N is contained in ker(f). let's call ker(f) K (for kernel).

what is this other homomorphism f'? it seems like it should be:

f'(gN) = f(g), right? does this even make sense?

well, first we need to check "well-defined-ness". that is, we need to be sure that if gN = hN, then f(g) = f(h).

if gN = hN, then h^{-1}g is in N. but N is contained in K, so h^{-1}g is also in K, which means f(h^{-1}g) = e' (since K is the kernel of f).

since f is a homomorphism, f(h^{-1}g) = f(h^{-1})f(g) = [f(h)]^{-1}f(g). since this also equals e', we have:

f(g) = f(h) (multiply by f(h) on the left). so f' is indeed at least a function from G/N to G'.
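to see why the condition "N is contained in K" really matters, here is a rough python check on a toy example of my own choosing: G = Z_12 under addition, N = {0, 6}, and two homomorphisms f_good(g) = g mod 3 and f_bad(g) = g mod 4 (all of these names are made up for illustration). f_good kills N, f_bad does not, since f_bad(6) = 2.

```python
# toy example (an assumption for illustration): G = Z_12 under addition, N = {0, 6}
n = 12
N = [0, 6]

def coset(g):
    """the coset g + N."""
    return frozenset((g + x) % n for x in N)

def is_well_defined(f):
    """true if f is constant on every coset of N, so that f'(gN) = f(g) makes sense."""
    return all(len({f(x) for x in coset(g)}) == 1 for g in range(n))

def f_good(g):
    return g % 3   # homomorphism Z_12 --> Z_3, kills N since 6 mod 3 = 0

def f_bad(g):
    return g % 4   # homomorphism Z_12 --> Z_4, does NOT kill N since 6 mod 4 = 2

print(is_well_defined(f_good))  # True
print(is_well_defined(f_bad))   # False
```

with f_bad, the coset {0, 6} would have to be sent to both 0 and 2 at once, so there is no function f' to speak of.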

to see that f' is actually a homomorphism, we compute:

f'((gN)(hN)) = f'((gh)N) = f(gh) = f(g)f(h) = f'(gN)f'(hN).

finally we check that f'∘p = f:

(f'∘p)(g) = f'(p(g)) = f'(gN) = f(g).
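if you want to see the whole factorization on an actual group, here is a short python sketch. the setup is a toy example i picked, not anything canonical: G = Z_12 under addition, N = {0, 6}, and f(g) = g mod 3, which kills N (its kernel is {0, 3, 6, 9}, which is bigger than N). f' is stored as a dictionary keyed by cosets, and the names f_prime and coset_sum are illustrative.

```python
# toy example (an assumption for illustration): G = Z_12, N = {0, 6}, f(g) = g mod 3
n = 12
G = range(n)
N = [0, 6]

def p(g):
    """the canonical map g --> g + N."""
    return frozenset((g + x) % n for x in N)

def f(g):
    """a homomorphism Z_12 --> Z_3 that kills N."""
    return g % 3

def coset_sum(A, B):
    """the product (here: sum) of two cosets in G/N."""
    return frozenset((a + b) % n for a in A for b in B)

# f'(gN) = f(g); this only makes sense because f is constant on cosets of N
f_prime = {p(g): f(g) for g in G}

# f' is a homomorphism: f'((gN)(hN)) = f'(gN) f'(hN), written additively
is_hom = all(
    f_prime[coset_sum(p(g), p(h))] == (f_prime[p(g)] + f_prime[p(h)]) % 3
    for g in G for h in G
)

# f factors through p: f'(p(g)) = f(g) for every g
factors = all(f_prime[p(g)] == f(g) for g in G)

print(is_hom, factors)  # True True
```

the second check would fail if f did not kill N, because the dictionary can only hold one value per coset.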

this is all pretty much the same thing you see in the first isomorphism theorem, it's not all THAT exciting. so why bother?

well...because proving things about N as a subgroup of G, and about G/N as a factor group (quotient group) usually depends on (like we did above) picking "typical elements" g,h in G and seeing what happens to them.

but saying: p is universal among homomorphisms that kill N isn't a statement about ELEMENTS of G, it's a statement about our "canonical" homomorphism p. in other words, instead of looking at the OBJECT G/N, we're looking at the MAPPING p.

and we're saying: if a homomorphism (any homomorphism) kills N, it has to "go through p" first. p is "special" or "distinguished" among all the homomorphisms that kill N: it's OPTIMAL (it kills N, and ONLY N, not anything more).

that is: given a group G and a normal subgroup N, how do we make a homomorphism that "just kills N"? form the factor group G/N (that is, apply the homomorphism p).

this construction occurs so frequently, we don't even bother giving p "a name", because we do it "the same way" for every group G and any normal subgroup N.

in this light, what the first isomorphism theorem says is:

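there is no practical difference between a factor group G/N and a homomorphic image of G: every factor group G/N is the image of G under p, and every homomorphic image f(G) is isomorphic to the factor group G/ker(f). we can use either "construction" as it suits us. sometimes the quotient group G/N is easier to understand, sometimes f(G) is easier to deal with.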
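as a sanity check, here is a small python sketch of this on a toy example of my own (G = Z_12 under addition, f(g) = g mod 3, names purely illustrative). this time we take N to be the whole kernel K of f, and check that the induced map from G/K to f(G) is one-to-one and onto.

```python
# toy example (an assumption for illustration): G = Z_12, f(g) = g mod 3
n = 12
G = range(n)

def f(g):
    """a homomorphism Z_12 --> Z_3."""
    return g % 3

# the kernel of f: everything sent to the identity 0
K = [g for g in G if f(g) == 0]
print(K)  # [0, 3, 6, 9]

def p(g):
    """the canonical map g --> g + K."""
    return frozenset((g + k) % n for k in K)

quotient = {p(g) for g in G}        # the factor group G/K, as a set of cosets
image = {f(g) for g in G}           # the homomorphic image f(G)
f_prime = {p(g): f(g) for g in G}   # the induced map G/K --> f(G)

# one value per coset, and as many distinct values as there are cosets
is_bijection = len(quotient) == len(image) == len(set(f_prime.values()))
print(len(quotient), len(image), is_bijection)  # 3 3 True
```

so in this example G/K and f(G) are "the same group" with different labels on the elements: cosets on one side, remainders mod 3 on the other.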

so rather than carry the letter p around and explicitly define it (for a particular subgroup N, and a particular group G), we just say:

consider the canonical homomorphism G-->G/N, which you are expected to know means "p".