Hello, I am looking for the best way to implement modular exponentiation.
Basically, I need to perform a*b%c thousands of times.

I am familiar with the straightforward method and the Montgomery method.
The straightforward approach differs from Montgomery in that each step does a compare and subtract instead of a test and add. As I understand it, the Montgomery method needs only half as many steps as the straightforward method.
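
To make the comparison concrete, here is a minimal Python sketch of the two bit-serial multipliers as I understand them. The function names and the `nbits` parameter are my own, for illustration only, and it assumes 0 <= a, b < m with m odd for the Montgomery version. It is only meant to show where the compare/subtract and the test/add happen, not to be a finished hardware design.

```python
def mulmod_interleaved(a, b, m, nbits):
    """Straightforward bit-serial a*b mod m (shift, add, compare and subtract).

    Assumes 0 <= a, b < m and that b fits in nbits bits.
    """
    r = 0
    for i in reversed(range(nbits)):   # scan b from the most significant bit
        r <<= 1                        # r < 2m after the shift
        if (b >> i) & 1:
            r += a                     # r < 3m after the add
        # compare and subtract: at most two subtractions bring r below m
        if r >= m:
            r -= m
        if r >= m:
            r -= m
    return r


def mont_mul(a, b, m, nbits):
    """Radix-2 Montgomery product: a*b*2^(-nbits) mod m.

    Assumes m is odd, 0 <= a, b < m, and that a fits in nbits bits.
    """
    r = 0
    for i in range(nbits):             # scan a from the least significant bit
        if (a >> i) & 1:
            r += b
        if r & 1:                      # test the low bit ...
            r += m                     # ... and add m so r becomes even
        r >>= 1                        # exact division by 2, r stays below 2m
    if r >= m:                         # single final correction
        r -= m
    return r
```

Note that `mont_mul` returns the product with an extra factor of 2^(-nbits), so operands have to be converted into Montgomery form first (for example with `mont_mul(x, pow(2, 2 * nbits, m), m, nbits)`) and converted back at the end (with `mont_mul(x, 1, m, nbits)`).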

What do you suggest I do? Maybe there is a better method? Is there a nice, fast way to compute m^e mod d, with d being 4096 bits long?
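
For reference, this is the baseline I would be comparing against: a minimal sketch of left-to-right square-and-multiply in Python. The name `modexp` is hypothetical, and the `%` reductions stand in for whichever modular multiplier is used in hardware. For an n-bit exponent it needs n squarings plus one multiplication per set bit, so roughly 4096 squarings and about 2048 multiplications on average for a 4096-bit exponent.

```python
def modexp(base, exp, mod):
    """Left-to-right binary (square-and-multiply) exponentiation.

    Computes base**exp % mod for exp >= 0 and mod >= 1.
    Not constant-time: the multiply step depends on the exponent bits.
    """
    result = 1 % mod
    base %= mod
    for i in reversed(range(exp.bit_length())):  # from the top exponent bit down
        result = (result * result) % mod         # always square
        if (exp >> i) & 1:
            result = (result * base) % mod       # multiply only when the bit is set
    return result
```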

I need it to implement RSA on specialized hardware, so I am not constrained by any existing implementation. It just needs to be fast enough to run in real time.

Later I will also do a DH (Diffie-Hellman) key exchange, which also uses modular exponentiation, but (I think) the approach is the same.
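
A toy sketch of why I think the same exponentiation unit should cover DH: both sides only compute g^x mod p. The numbers below are tiny textbook-style values chosen for illustration, not real parameters, and Python's built-in three-argument `pow` stands in for the hardware exponentiator.

```python
# Toy Diffie-Hellman parameters (illustrative only, far too small for real use)
p = 23          # public prime modulus
g = 5           # public generator

a_secret = 6    # Alice's private exponent
b_secret = 15   # Bob's private exponent

# Each side performs one modular exponentiation to get its public value
a_public = pow(g, a_secret, p)
b_public = pow(g, b_secret, p)

# Each side performs a second modular exponentiation on the other's public value
a_shared = pow(b_public, a_secret, p)
b_shared = pow(a_public, b_secret, p)

print(a_shared == b_shared)  # both sides derive the same shared value
```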