Hey Kyo.

I don't have an actual answer or proof for you but I know these results have been proved in multi-linear or tensor algebra.

The intuition for me is that you are doing a composition of functions and that you map one thing to something with dim(X) and then map each element to its dimension of dim(Y) for each component just like a composition of functions.

I guess what you could show is that you start with specific vectors of certain length and if you have z vectors in your matrix with n linearly independent vectors, then the total dimension if every element of each vector is independent will have z*n independent components.

Now consider the multiplication of the matrices in that both matrices are independent which means that the dimension doesn't reduce for the shared system (i.e. no linear dependencies between the entries of both matrices) and the cardinality of the dimensions is multiplied since you are taking the permutation of the element in one system with another (like a Cartesian product of sorts but the (a,b) denotes a from one matrix and b from the other).

Again formalizing this would be done in tensor analysis and multi-linear matrix products but the intuition is the same as the cardinality idea of the Cartesian product.