Trying to translate the MLE formula into words

What is the right narrative (i.e. in words, not in symbols) for this definition of a maximum likelihood estimator (Penzer, LSE)?

$\displaystyle L_Y(\hat{\theta}; y)=\sup_{\theta\in\Theta}L_Y(\theta; y)$

Let me try.

We have a sample $\displaystyle Y=(Y_1,\dots,Y_n)$. This sample is parametrised by a parameter $\displaystyle \theta=(\theta_1,\dots,\theta_n)$, which can take values in a parameter space $\displaystyle \Theta$.

Then we have a set of all possible likelihood estimators given that sample and that parameter. The maximum likelihood estimator (MLE) $\displaystyle \hat{\theta}$ is then the least upper bound of this set of all possible likelihood estimators.
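To make my reading concrete, here is a minimal sketch of how I picture it. It is my own toy example, not from Penzer's notes: I assume an i.i.d. exponential sample with rate $\displaystyle \theta$, made-up data `y`, and a finite grid `theta_grid` standing in for $\displaystyle \Theta$. It evaluates the (log-)likelihood at every $\displaystyle \theta$ on the grid, takes the largest of those values, and reads off the $\displaystyle \theta$ at which it occurs.

```python
import numpy as np

def exp_log_likelihood(theta, y):
    """Log-likelihood of i.i.d. Exponential(rate=theta) data y (assumed model)."""
    y = np.asarray(y, dtype=float)
    return y.size * np.log(theta) - theta * y.sum()

# Hypothetical observed sample (made-up numbers for illustration only).
y = np.array([0.5, 1.5, 1.0, 2.0, 1.0])

# A finite grid standing in for the parameter space Theta = (0, inf).
theta_grid = np.linspace(0.01, 5.0, 5000)

# One log-likelihood value per candidate theta.
log_lik = exp_log_likelihood(theta_grid, y)

best_value = log_lik.max()                # largest log-likelihood value on the grid
theta_hat = theta_grid[log_lik.argmax()]  # the theta at which that value occurs

# For this particular model the closed-form maximiser is 1 / mean(y).
closed_form = 1.0 / y.mean()
```

In this sketch the least upper bound is taken over the likelihood values, and `theta_hat` is the parameter value that produces it, which may or may not match the wording I used above.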

Is that right? I want to check that I understand the notation fully.

Also, is this the same as another definition of the MLE (Casella and Berger): $\displaystyle \hat{\theta}$ is a parameter value at which the likelihood function attains its maximum as a function of $\displaystyle \theta$? Why is there a clear-cut 'maximum' here, while in the definition above it is a 'supremum'? Does that mean there is a possibility that this value is not part of the set of all likelihood estimators?
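To illustrate the kind of situation I have in mind, here is a toy case I put together (my own assumption, not taken from either book): i.i.d. data with density $\displaystyle 1/\theta$ on the open interval $\displaystyle 0<y<\theta$. The likelihood is $\displaystyle \theta^{-n}$ when $\displaystyle \theta>\max_i y_i$ and $\displaystyle 0$ otherwise, so its values get arbitrarily close to $\displaystyle (\max_i y_i)^{-n}$ without ever reaching it.

```python
import numpy as np

def open_uniform_likelihood(theta, y):
    """Likelihood of i.i.d. data y under density 1/theta on the OPEN interval 0 < y < theta."""
    y = np.asarray(y, dtype=float)
    if theta > y.max():
        return theta ** (-y.size)
    return 0.0  # some observation is not strictly below theta

# Hypothetical sample (made-up numbers for illustration only).
y = np.array([0.25, 0.5, 1.0, 2.0])
y_max = y.max()

# Least upper bound of the likelihood values: y_max ** (-n).
supremum = y_max ** (-y.size)

# Exactly at theta = y_max the likelihood drops to zero ...
at_boundary = open_uniform_likelihood(y_max, y)

# ... while just above y_max it creeps up towards the supremum.
approaching = [open_uniform_likelihood(y_max + eps, y) for eps in (1e-1, 1e-3, 1e-6)]
```

Here every value in `approaching` stays strictly below `supremum`, so there is a supremum but no maximising $\displaystyle \theta$ in $\displaystyle \Theta$. Is this the sort of case the 'supremum' wording is meant to cover?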