To understand this, we need to understand the Mean Value Theorem. Basically, the Mean Value Theorem says that if a function $F$ is continuous on an interval $[a, b]$ and differentiable on $(a, b)$, then the gradient of the chord joining the endpoints is equal to the gradient of the function at some point $c$ between them. In symbols: $(b - a)\,F'(c) = F(b) - F(a)$ for some $c$ in $(a, b)$.
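To make the theorem concrete, here is a minimal sketch that hunts for such a point $c$ numerically. The example function $F(x) = x^3$, the interval $[0, 2]$, and the helper name `mvt_point` are all assumptions chosen for illustration; the bisection search only works here because $F'$ is increasing on the interval.

```python
def F(x):
    # Example function, chosen only for illustration: F(x) = x^3
    return x ** 3


def dF(x):
    # Its derivative: F'(x) = 3x^2
    return 3 * x ** 2


def mvt_point(F, dF, a, b, tol=1e-12):
    """Find c in (a, b) where dF(c) equals the gradient of the chord.

    Hypothetical helper: uses bisection, so it assumes that
    dF(x) - chord changes sign between a and b.
    """
    chord = (F(b) - F(a)) / (b - a)

    def g(x):
        return dF(x) - chord

    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("no sign change; bisection does not apply")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid  # the root lies in [lo, mid]
        else:
            lo = mid  # the root lies in [mid, hi]
    return (lo + hi) / 2


a, b = 0.0, 2.0
c = mvt_point(F, dF, a, b)
print(c)                             # roughly 1.1547, i.e. 2 / sqrt(3)
print((b - a) * dF(c), F(b) - F(a))  # the two sides of the theorem
```

The two numbers on the last line agree up to rounding, which is exactly what the theorem promises for this $c$.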
Now let's assume that we want to evaluate the area under a curve $y = f(x)$ between two points $x = a$ and $x = b$. To do this, we can subdivide this interval into $n$ rectangles with edges at $a = x_0 < x_1 < \dots < x_n = b$ and mark in the midpoint of each subinterval, i.e. $m_i = \frac{x_{i-1} + x_i}{2}$. The width of each rectangle will be the difference between two successive subdivisions, $x_i - x_{i-1}$, and the height of each rectangle will be the value of the function at the midpoint of the subinterval, $f(m_i)$. So the area of each rectangle is $(x_i - x_{i-1})\,f(m_i)$.
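Now we can approximate the entire area $A$ by adding all these rectangles (with $m_i$ the midpoint of the subinterval from $x_{i-1}$ to $x_i$). So
$$A \approx \sum_{i=1}^{n} (x_i - x_{i-1})\,f(m_i)$$
and this approximation becomes more accurate when we make more subdivisions. So that means
$$A = \lim_{n \to \infty} \sum_{i=1}^{n} (x_i - x_{i-1})\,f(m_i).$$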
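The rectangle sum is easy to try out. The sketch below assumes equal-width subdivisions (the argument itself does not need them) and uses $f(x) = 3x^2$ on $[0, 2]$ purely as an example; `midpoint_sum` is a hypothetical helper name.

```python
def midpoint_sum(f, a, b, n):
    """Add up n midpoint rectangles under f on [a, b].

    Assumes equal-width subdivisions for simplicity.
    """
    width = (b - a) / n
    total = 0.0
    for i in range(1, n + 1):
        x_prev = a + (i - 1) * width   # left edge of the i-th rectangle
        x_i = a + i * width            # right edge of the i-th rectangle
        m_i = (x_prev + x_i) / 2       # midpoint of the subinterval
        total += (x_i - x_prev) * f(m_i)
    return total


def f(x):
    # Example curve, chosen only for illustration
    return 3 * x ** 2


for n in (1, 2, 4, 8, 100, 1000):
    print(n, midpoint_sum(f, 0.0, 2.0, n))
```

As $n$ grows, the printed sums change less and less from one line to the next, which is the limit above settling down.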
Let's take another look at the Mean Value Theorem:
$$(b - a)\,F'(c) = F(b) - F(a) \quad \text{for some } c \text{ between } a \text{ and } b.$$
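Notice that the area of each rectangle looks VERY similar to the LHS of this equation (i.e. the difference between two x values multiplied by a function value evaluated at a point between the two x values). So let $F$ be an antiderivative of $f$ (that is, $F' = f$) and apply the theorem to $F$ on each subinterval: there is a point $c_i$ between $x_{i-1}$ and $x_i$ with $(x_i - x_{i-1})\,f(c_i) = F(x_i) - F(x_{i-1})$. When we increase the number of rectangles and make the subdivisions extremely small, the midpoint $m_i$ and the point $c_i$ get squeezed together, so (because $f$ is continuous) $f(m_i)$ becomes as close as we like to $f(c_i)$, the value where the gradient of $F$ is equal to the gradient of the chord. So that means we can use the Mean Value Theorem to simplify:
$$A = \lim_{n \to \infty} \sum_{i=1}^{n} \bigl(F(x_i) - F(x_{i-1})\bigr) = F(x_n) - F(x_0) = F(b) - F(a),$$
because every intermediate term $F(x_i)$ appears once with a plus sign and once with a minus sign and cancels.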
So that means to evaluate the area under a curve between two points, you evaluate the difference between an antiderivative of the function at those two points.
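The conclusion can be checked numerically. This is only a sketch under assumed choices: $f(x) = 3x^2$ with antiderivative $F(x) = x^3$ on $[0, 2]$, equal-width subdivisions, and a hypothetical helper `midpoint_sum`.

```python
def f(x):
    # Example curve, chosen only for illustration
    return 3 * x ** 2


def F(x):
    # An antiderivative of f: F'(x) = 3x^2 = f(x)
    return x ** 3


def midpoint_sum(f, a, b, n):
    """Add up n equal-width midpoint rectangles under f on [a, b]."""
    width = (b - a) / n
    return sum(width * f(a + (i - 0.5) * width) for i in range(1, n + 1))


a, b = 0.0, 2.0
exact = F(b) - F(a)  # difference of the antiderivative at the two points

# The rectangle sums approach F(b) - F(a) as n grows.
for n in (10, 100, 1000):
    approx = midpoint_sum(f, a, b, n)
    print(n, approx, exact - approx)

# The telescoping step: the differences F(x_i) - F(x_{i-1}) add up to
# F(b) - F(a) for any number of subdivisions, up to rounding.
n = 10
xs = [a + i * (b - a) / n for i in range(n + 1)]
telescoped = sum(F(xs[i]) - F(xs[i - 1]) for i in range(1, n + 1))
print(telescoped, exact)
```

The gap in the last column of the loop shrinks as `n` grows, while the telescoped sum matches `F(b) - F(a)` up to rounding even for a coarse subdivision.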