Since there are two trials and y is the observed number of heads, the likelihood function is:

L(p) = C(2,y) * p^y * (1-p)^(2-y), where C(2,y) is the binomial coefficient "2 choose y".

To obtain the maximum likelihood estimate, maximize L(p) over the two possible values p=1/4 and p=3/4. For y=0, L(p)=(1-p)^2, which is maximized by p=1/4. For y=1, L(p)=2*p*(1-p), which equals 3/8 at both p=1/4 and p=3/4, so both values maximize it and the estimate is not unique. Analogously, for y=2, L(p)=p^2, so p=3/4 is the maximum likelihood estimate.
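The case-by-case comparison can be checked numerically. The following is a minimal sketch, not part of the original argument; the names `likelihood`, `ml_estimates` and `CANDIDATES` are made up for illustration, and exact fractions are used so that the tie at y=1 is detected without rounding issues.

```python
from fractions import Fraction
from math import comb

# The two admissible values of p in this problem.
CANDIDATES = [Fraction(1, 4), Fraction(3, 4)]

def likelihood(p, y, n=2):
    """Binomial likelihood of y heads in n trials with head probability p."""
    return comb(n, y) * p**y * (1 - p)**(n - y)

def ml_estimates(y, n=2):
    """Return every candidate p that attains the maximum likelihood for y."""
    values = {p: likelihood(p, y, n) for p in CANDIDATES}
    best = max(values.values())
    return [p for p in CANDIDATES if values[p] == best]

for y in range(3):
    print(y, [str(p) for p in ml_estimates(y)])
```

For y=1 the function returns both candidates, which reflects the tie in the likelihood.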

The second point is invariance, which is one of the basic properties of maximum likelihood estimates. If L(theta) is the likelihood function and beta=t(theta) is a one-to-one reparameterization, then L'(beta)=L( t^(-1)(beta) ) is the likelihood function for beta. Since theta' maximizes L, beta'=t(theta') maximizes L'.
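Invariance can be illustrated with a small grid search. This is only a sketch under assumptions that are not in the text above: a binomial sample with n=10 and y=3, a grid p=0.01,...,0.99, and the odds transform t(p)=p/(1-p) as the reparameterization. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def likelihood(p, y, n):
    """Binomial likelihood L(p) for y heads in n trials."""
    return comb(n, y) * p**y * (1 - p)**(n - y)

def t(p):
    """Assumed reparameterization: odds beta = p / (1 - p)."""
    return p / (1 - p)

def t_inv(beta):
    """Inverse of t: p = beta / (1 + beta)."""
    return beta / (1 + beta)

def likelihood_beta(beta, y, n):
    """Likelihood for beta, defined as L'(beta) = L(t^(-1)(beta))."""
    return likelihood(t_inv(beta), y, n)

n, y = 10, 3  # illustrative data, chosen arbitrarily
p_grid = [Fraction(k, 100) for k in range(1, 100)]
beta_grid = [t(p) for p in p_grid]

# Maximize each likelihood over its own grid.
p_hat = max(p_grid, key=lambda p: likelihood(p, y, n))
beta_hat = max(beta_grid, key=lambda b: likelihood_beta(b, y, n))

print(p_hat, beta_hat, beta_hat == t(p_hat))
```

The maximizer found on the beta grid coincides with t applied to the maximizer found on the p grid, as invariance predicts.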

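A consequence of invariance is that the ML estimate is generally biased: even if theta' is unbiased for theta, t(theta') is in general not unbiased for t(theta) when t is nonlinear, because the expectation of t(theta') generally differs from t applied to the expectation of theta'.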
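A small exact computation shows the effect. It is a sketch under assumptions that go beyond the problem above: p is allowed to range over the whole interval [0,1], so the ML estimate of p is y/n, and the transform is t(p)=p^2. The helper `expectation` is a hypothetical name.

```python
from fractions import Fraction
from math import comb

def expectation(g, p, n):
    """Exact expectation of g(y/n) when y ~ Binomial(n, p)."""
    return sum(
        g(Fraction(y, n)) * comb(n, y) * p**y * (1 - p)**(n - y)
        for y in range(n + 1)
    )

p, n = Fraction(1, 4), 2  # true p and number of trials (illustrative)

# The unrestricted ML estimate y/n is unbiased for p ...
mean_p_hat = expectation(lambda ph: ph, p, n)

# ... but by invariance the ML estimate of p^2 is (y/n)^2, which is biased.
mean_p_hat_sq = expectation(lambda ph: ph**2, p, n)
bias_sq = mean_p_hat_sq - p**2

print(mean_p_hat, mean_p_hat_sq, bias_sq)
```

Here the bias of (y/n)^2 equals p(1-p)/n, the variance of y/n, so it shrinks as n grows but is nonzero for any finite n when 0 < p < 1.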