12-14-2010, 06:55 PM
Dalibor, the ATI CAL compiler wasn't smart enough to use BFI_INT, mainly because this instruction exists only at the ISA level, while the lowest level available to the programmer is IL. With some hacks it was possible to use BFI_INT anyway, which brings another major speed-up for MD5 (~16%).
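For anyone wondering why BFI_INT helps MD5: as far as I understand it, the instruction is a per-bit select, and two of the MD5 round functions (F and G from RFC 1321) have exactly that shape, so each can collapse into a single operation. Below is a rough software model in plain C, not actual IL or ISA code. The helper names (bfi_int, md5_f_bfi and so on) are made up for illustration, and the operand order of the real instruction is an assumption here.

```c
#include <stdint.h>

/* Software model of a bit-select: for every bit position, take the bit
 * from b where mask is 1 and from c where mask is 0.
 * Assumption: this mirrors what BFI_INT computes; the hardware operand
 * order may differ from this hypothetical helper. */
static uint32_t bfi_int(uint32_t mask, uint32_t b, uint32_t c)
{
    return (mask & b) | (~mask & c);
}

/* MD5 round functions F and G as defined in RFC 1321. */
static uint32_t md5_f_plain(uint32_t x, uint32_t y, uint32_t z)
{
    return (x & y) | (~x & z);
}

static uint32_t md5_g_plain(uint32_t x, uint32_t y, uint32_t z)
{
    return (x & z) | (y & ~z);
}

/* The same functions written as a single bit-select each. */
static uint32_t md5_f_bfi(uint32_t x, uint32_t y, uint32_t z)
{
    return bfi_int(x, y, z);   /* x picks between y and z */
}

static uint32_t md5_g_bfi(uint32_t x, uint32_t y, uint32_t z)
{
    return bfi_int(z, x, y);   /* z picks between x and y */
}

/* Returns 1 if both formulations agree for the given inputs, 0 otherwise. */
static int md5_bfi_matches(uint32_t x, uint32_t y, uint32_t z)
{
    return md5_f_plain(x, y, z) == md5_f_bfi(x, y, z)
        && md5_g_plain(x, y, z) == md5_g_bfi(x, y, z);
}
```

The plain versions cost several bitwise operations each, while the select versions are one operation each, which is presumably where the gain comes from on hardware that has the instruction.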
So thanks again for the idea.