03-30-2012, 10:49 PM
atom, did you try vectorized kernels like the ones used for sm_21? From what I understand, the GTX 680 may turn out (from an OpenCL perspective) to closely resemble a VLIW card, just in a different way. I mean using at least uint4 for best resource utilization.
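
To make the uint4 idea concrete, here is a rough plain-C sketch of the difference between a scalar kernel (one candidate per work item) and a 4-wide one (four independent candidates per work item, which gives the compiler independent instructions to schedule side by side). Everything in it is an assumption for illustration: `mix` is a made-up mixing round and not any real hash, `u32x4` only emulates OpenCL's `uint4` on the host, and the loops stand in for work items. It says nothing about actual speed on a GTX 680.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate left; r must be in 1..31. */
static uint32_t rotl32(uint32_t v, unsigned r) {
    return (v << r) | (v >> (32 - r));
}

/* Hypothetical stand-in for one hash round (NOT a real algorithm). */
static uint32_t mix(uint32_t v) {
    v ^= rotl32(v, 13);
    v *= 0x9E3779B1u;
    return v ^ (v >> 15);
}

/* Host-side emulation of OpenCL's uint4 (hypothetical type name). */
typedef struct { uint32_t x, y, z, w; } u32x4;

/* Same round applied to four independent lanes. */
static u32x4 mix4(u32x4 v) {
    u32x4 r = { mix(v.x), mix(v.y), mix(v.z), mix(v.w) };
    return r;
}

/* Scalar "kernel": each loop iteration plays one work item. */
static void run_scalar(const uint32_t *in, uint32_t *out, size_t n) {
    for (size_t gid = 0; gid < n; ++gid)
        out[gid] = mix(in[gid]);
}

/* Vectorized "kernel": each work item handles four candidates. */
static void run_vec4(const uint32_t *in, uint32_t *out, size_t n) {
    assert(n % 4 == 0); /* sketch assumes n is a multiple of 4 */
    for (size_t gid = 0; gid < n / 4; ++gid) {
        u32x4 v = { in[4 * gid], in[4 * gid + 1],
                    in[4 * gid + 2], in[4 * gid + 3] };
        u32x4 r = mix4(v);
        out[4 * gid]     = r.x;
        out[4 * gid + 1] = r.y;
        out[4 * gid + 2] = r.z;
        out[4 * gid + 3] = r.w;
    }
}
```

Both functions produce identical output for the same input; the vectorized one just launches a quarter as many work items, each doing four times the work. In a real OpenCL kernel the lane-wise arithmetic would be written directly on `uint4` values.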