assume we know

100001001 mod P = a

how can we calculate efficiently:

000010010 mod P = ?

(MSB 1 was removed, LSB 0 was added)

000010011 mod P = ?

(MSB 1 was removed, LSB 1 was added)

In what's below, when I use single quotes for 'mod,' I mean the operator that returns the common residue. Without single quotes, mod is used in the usual way for math textbooks.

Well if we know 000010010 'mod' P, then 000010011 'mod' P is easy because it's just the first result plus 1.

(Unless the first result is P-1, then the second value is 0.)
Is it right to assume that \(\displaystyle n\) occupies all 4096 bits? If so, then the MSB being 1 signifies a summand of \(\displaystyle 2^{4095}\). If not, it's not hard to adjust according to the location of the MSB.

But assuming all 4096 bits are necessary to store \(\displaystyle n\), then \(\displaystyle n' = 2\cdot(n-2^{4095})\), if we let the new LSB = 0. So we want to find

\(\displaystyle n' \text{ `mod' }p = \left(2n-2^{4096}\right) \text{ `mod' }p\)

In other words,

\(\displaystyle 2n - 2^{4096} \equiv\ ?\ (\text{mod }p)\)

Depending on the size of \(\displaystyle p\), we can use Fermat's little theorem to reduce the size of the exponent 4096, and afterwards we can use fast exponentiation as described in my first post. If p > 4097, then I believe it will take \(\displaystyle 11 = log_2(4096)-1\) steps to get \(\displaystyle 2^{4096}\ (\text{mod }p)\), just by successively squaring the current result. But if you're doing this operation a lot, you can store the value(s) in a table.

Does all this seem reasonable to you?

(Edit: Added the part in green for correctness, since we need common residue.)