permutation matrices do more than "swap rows". a more apt description would be "shuffle rows", or "permute rows".

so a better proof of (1) says the k-th row of PσA = the k-th row of A, hence PσA = A, for all A (important!), because only then can you conclude Pσ = I (because the multiplicative identity I of nxn matrices is unique).

(2) same problem. Pτ takes the k-th row of A to the τ(k)-th row of PτA, and then Pσ takes that to the σ○τ(k)-th row of PσPτA. the rows aren't "swapped" we can't say that the τ(k)-th row of A is taken by Pτ to the k-th row of A.

(3) looks fine.

(4) this is not true. only the other entries in the j-th row and the i-th column of Pσ^-1 will be 0. you need to argue row-by-row, or column-by-column, since Pσ has a 1 in EACH row and EACH column. that is, the j-th column of Pσ is ei, the i-th basis vector. so the transpose has the i-th basis vector in the j-th row.

clearly, when you multply Pσ by its transpose, the only non-zero entries of the product will be on the diagonal, when i = j, because the i,j-th entry of Pσ(Pσ^T) is <ej,ei>, which is what i think you meant. since the ej form an orthonormal basis, this will be the identity matrix I, so the inverse of Pσ is its transpose.

all of this assumes that the columns of Pσ are σ(ei) for each i. if the rows of Pσ are σ(ei), then PσPτ = Pτσ instead of (2), but all the other properties hold. sometimes permutation matrices are defined this way, because sometimes algebraists use τσ to mean do τ first, then σ (so they write (x)τσ).

the point is, that σ-->Pσ defines an monomorphism of Sn into GLn(F). the orthogonality of the Pσ is a nice bonus, geometrically it means all versions of F^n have the same geometric properties, the labelling of the basis vectors doesn't matter that much (well, almost. odd permutations of Sn reverse orientation).