I'm porting a small academic OS from TriCore to ARM Cortex (Thumb-2 instruction set). For the scheduler to work, I sometimes need to JUMP directly to another function without modifying the stack nor the link register.
On TriCore (or, rather, on tricore-g++), this wrapper template (for any three-argument-function) works:
template< class A1, class A2, class A3 >
inline void __attribute__((always_inline))
JUMP3( void (*func)( A1, A2, A3), A1 a1, A2 a2, A3 a3 ) {
typedef void (* __attribute__((interrupt_handler)) Jump3)( A1, A2, A3);
( (Jump3)func )( a1, a2, a3 );
}
//example for using the template:
JUMP3( superDispatch, this, me, next );
This would generate the assembler instruction J (a.k.a. JUMP) instead of CALL, leaving the stack and CSAs unchanged when jumping to the (otherwise normal) C++ function superDispatch(SchedulerImplementation* obj, Task::Id from, Task::Id to).
Now I need an equivalent behaviour on ARM Cortex (or, rather, for arm-none-linux-gnueabi-g++), i.e. generate a B (a.k.a. BRANCH) instruction instead of BLX (a.k.a. BRANCH with link and exchange). But there is no interrupt_handler attribute for arm-g++ and I could not find any equivalent attribute.
So I tried to resort to asm volatile and writing the asm code directly:
template< class A1, class A2, class A3 >
inline void __attribute__((always_inline))
JUMP3( void (*func)( A1, A2, A3), A1 a1, A2 a2, A3 a3 ) {
asm volatile (
"mov.w r0, %1;"
"mov.w r1, %2;"
"mov.w r2, %3;"
"b %0;"
:
: "r"(func), "r"(a1), "r"(a2), "r"(a3)
: "r0", "r1", "r2"
);
}
So far, so good, in my theory, at least. Thumb-2 requires function arguments to be passed in the registers, i.e. r0..r2 in this case, so it should work.
But then the linker dies with
undefined reference to `r6'
on the closing bracket of the asm statement ... and I don't know what to make of it. OK, I'm not the expert in C++, and the asm syntax is not very straightforward... so has anybody got a hint for me? A hint to the correct __attribute__ for arm-g++ would be one way, a hint to fix the asm code would be another. Another way maybe would be to tell the compiler that a1..a3 should already be in the registers r0..r2 when the asm statement is entered (I looked into that a bit, but did not find any hint).