Basically, the difference is in how the observations are decomposed into terms, and in which extra factors a non-CRD model introduces.

In a CRD (completely randomized design) model you have only the overall mean, a treatment term for the i-th treatment, and an error (or residual) term.
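In symbols (my notation, not the original poster's), the standard textbook form is

$$y_{ij} = \mu + \tau_i + \varepsilon_{ij},$$

where $\mu$ is the overall mean, $\tau_i$ is the effect of the $i$-th treatment, and $\varepsilon_{ij}$ is the residual for the $j$-th observation under that treatment.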

In a non-CRD design you also have a blocking term, and you can add as many further terms as you like by decomposing the remaining residual, for example interaction terms between different variables.
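As a concrete case, a randomized complete block design (one common non-CRD) pulls a block effect $\beta_j$ out of the residual:

$$y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}.$$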

The real key to this is to understand how the residual term is being decomposed. In the absolute simplest model, only the treatment variable contributes its degrees of freedom, and all the remaining information is absorbed into the residual term.

As you add more terms to the model, you are essentially shifting variation out of the residual and into the model explicitly, one new term at a time.
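To make that shift concrete, here is a small numerical sketch (the data and effect sizes are made up for illustration). For a balanced layout, the block sum of squares moves out of the residual exactly when the block term enters the model:

```python
import numpy as np

# Toy data: 3 treatments x 4 blocks, generated additively with noise
rng = np.random.default_rng(0)
treat_eff = np.array([0.0, 2.0, 4.0])        # hypothetical treatment effects
block_eff = np.array([0.0, 1.0, -1.0, 0.5])  # hypothetical block effects
y = 10 + treat_eff[:, None] + block_eff[None, :] + rng.normal(0, 0.5, (3, 4))

grand = y.mean()
ss_total = ((y - grand) ** 2).sum()

# Treatment SS: variation of the 3 treatment means around the grand mean
ss_treat = 4 * ((y.mean(axis=1) - grand) ** 2).sum()

# Block SS: variation of the 4 block means around the grand mean
ss_block = 3 * ((y.mean(axis=0) - grand) ** 2).sum()

# CRD: the residual absorbs everything except the treatment term
ss_resid_crd = ss_total - ss_treat

# Block design: the block term takes its share out of the residual
ss_resid_block = ss_total - ss_treat - ss_block

print(ss_resid_crd, ss_resid_block)  # residual shrinks by exactly ss_block
```

The degrees of freedom move the same way: the block term claims 3 of them, and the residual keeps correspondingly fewer.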

The point of doing this is to see how the variation can be decomposed in order to test specific statistical properties of the experiment, and eventually to make comparisons between different estimated parameters, much as we do in a t-test. However, confounding can introduce bias into the experiment (potentially screwing things up and giving misleading output), so we need ways to test for such effects and decide whether they are large enough to violate the assumptions we rely on and to deviate from what we actually expect.
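As a sketch of one such comparison (made-up numbers; this uses scipy's `f_oneway` for a one-way ANOVA F-test, which compares between-treatment variation to residual variation):

```python
import numpy as np
from scipy.stats import f_oneway

# Hypothetical yields under three treatments
a = np.array([10.1, 9.8, 10.3, 10.0])
b = np.array([12.2, 11.9, 12.4, 12.1])
c = np.array([14.0, 13.7, 14.3, 13.9])

# F-test: is the variation between treatment means large
# relative to the within-treatment (residual) variation?
f_stat, p_value = f_oneway(a, b, c)
print(f_stat, p_value)  # large F, small p -> treatment means differ
```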

The goal of experimental design, in a statistical sense, is to systematically introduce ways to detect these sources of bias, test for them using all the mathematical muscle available, and, if they are present, think about how to remove them. Only then can we answer, with some stated level of confidence, the question we started with before even designing and performing the experiment, which is usually a comparison of some sort (for example between treatments, or between a treatment and a non-treatment situation).