ARM asm ::: well, finally I had to. There was nasty C code and even nasty C++, but after trying everything I had to drop down and see what an ARM processor can really do. asm, asm, here I go!
::: until then, read some other ARM hacks done by leachbj on ipodlinux (Linux on the 4th gen iPod). Quite nice stuff, with asm code like this:

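/* Fixed-point multiply: returns (x * y) >> n, rounded to nearest.
   n = 0 is not safe: the shift would leave the carry flag stale. */
static inline INT_32 fixp_mul_32s_nX( INT_32 x, INT_32 y, UNS_8 n ) {
    INT_32 res, tmp;
    __asm__ __volatile__ (
        "smull %0, %1, %3, %4 \n\t"     /* tmp = low word, res = high word of the 64-bit product x * y */
        "movs %0, %0, lsr %2 \n\t"      /* tmp = low >> n; carry = last bit shifted out */
        "rsb %2, %2, #32 \n\t"          /* n = 32 - n */
        "adc %1, %0, %1, lsl %2 \n\t"   /* res = tmp + (high << (32 - n)) + carry */
        : "=&r" (tmp), "=&r" (res), "+r" (n)
        : "r" (x), "r" (y)
        : "cc"                          /* movs changes the condition flags */
    );
    return res;
}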
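
To see what those four instructions compute, here is a rough portable C sketch of the same arithmetic. The name fixp_mul_ref is made up for illustration and is not part of leachbj's code. It assumes 1 <= n <= 31 and a compiler where >> on a negative signed value is an arithmetic shift, which is the usual behaviour but not guaranteed by the C standard.

```c
#include <stdint.h>

/* Hypothetical reference version of fixp_mul_32s_nX (illustration only):
   (x * y) >> n, rounded to nearest. Assumes 1 <= n <= 31. */
static int32_t fixp_mul_ref(int32_t x, int32_t y, unsigned n)
{
    /* Full 64-bit signed product, like smull. */
    int64_t prod = (int64_t)x * (int64_t)y;

    /* Adding half an LSB before shifting has the same effect as adding
       the last bit shifted out (the carry that adc picks up). */
    prod += (int64_t)1 << (n - 1);

    /* Assumes arithmetic right shift for negative values; the result is
       truncated to 32 bits, as in the asm version. */
    return (int32_t)(prod >> n);
}
```

For example, with 16 fractional bits, 1.5 is 0x18000 and 2.0 is 0x20000, so fixp_mul_ref(0x18000, 0x20000, 16) gives 0x30000, which is 3.0. The asm version gets the same result in four instructions without needing a 64-bit shift.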