I'm trying to understand a paper which shows a linear model:
where is an vector, is an matrix, and is an vector. Additive Gaussian noise is represented by with variance .
I'm trying to understand how it is that the data log likelihood has this form:
The paper gives no indication of how the log likelihood in eqn (2) follows from eqn (1). I've seen the same step in a few other papers already, so I assume it's standard prerequisite knowledge in probability, but I'm having trouble finding a source where I could look it up.
Another confusing aspect is that has dimensions , so how can you calculate its square, and how can the log likelihood be a scalar?
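
My best guess so far, written as a minimal Python sketch rather than anything taken from the paper: if the model has the generic form y = A x + n, with independent zero-mean Gaussian noise of variance sigma2 on each component, and the "square" is meant as the squared Euclidean norm of the residual y - A x, then the log likelihood would be the sum of the per-component Gaussian log densities, which is a scalar. The names y, A, x, sigma2 and both function names are my own placeholders, and the paper may drop or rewrite the constant term.

```python
import math
import random


def gaussian_log_likelihood(y, A, x, sigma2):
    """Closed form, ASSUMING y = A x + n with n ~ N(0, sigma2 * I)."""
    m = len(y)
    # Residual vector r = y - A x, one entry per measurement.
    r = [y[i] - sum(A[i][j] * x[j] for j in range(len(x))) for i in range(m)]
    # "Square" of a vector read as the squared Euclidean norm r^T r (a scalar).
    sq_norm = sum(ri * ri for ri in r)
    return -0.5 * m * math.log(2 * math.pi * sigma2) - sq_norm / (2 * sigma2)


def sum_of_componentwise_log_densities(y, A, x, sigma2):
    """Same quantity, built by adding one 1-D Gaussian log density per entry."""
    total = 0.0
    for i in range(len(y)):
        mean_i = sum(A[i][j] * x[j] for j in range(len(x)))
        total += (-0.5 * math.log(2 * math.pi * sigma2)
                  - (y[i] - mean_i) ** 2 / (2 * sigma2))
    return total


# Hypothetical sizes and values, chosen only for illustration.
random.seed(0)
m, k, sigma2 = 4, 3, 0.25
A = [[random.gauss(0, 1) for _ in range(k)] for _ in range(m)]
x = [random.gauss(0, 1) for _ in range(k)]
y = [sum(A[i][j] * x[j] for j in range(k)) + random.gauss(0, math.sqrt(sigma2))
     for i in range(m)]

print(gaussian_log_likelihood(y, A, x, sigma2))
print(sum_of_componentwise_log_densities(y, A, x, sigma2))
```

The two functions agree up to floating-point error, which is at least consistent with reading the square as a sum of squared components. Whether that is what eqn (2) intends is exactly what I'd like confirmed.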