ARM cannot encode an arbitrary offset in a load/store instruction, so
offsets whose absolute value exceeds 4095 need special handling.
The code for do_spill_read and do_spill_written used to be very similar.
I've partially factored out the code into spilling_create_load_store.
The former code loaded the offset from a constant pool, which wastes
memory bandwidth and cache lines. The new code tries to adjust the base
register so that the memory location can be reached more easily; this
lets us handle at least +-1MB with just a single additional ADD or SUB
instruction. If that fails, we resort to the normal constant loading code,
which on its own will fall back to loading the constant from a
constant pool.
So instead of:
ldr r1, =16388
ldr r0, [r13, r1]
which will use at least 4 cycles (2 instruction cycles + 2 stall
cycles) on most cores.
We try to generate:
add r1, r13, #16384
ldr r0, [r1, #4]
which most armv5+ cores will execute in 2 cycles. We'll also save on
DCache usage.
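
The base-register adjustment described above can be sketched as follows. This is only an illustration in Python, not the compiler's actual Pascal code: the helper names (rol32, is_shifter_const, split_offset) are hypothetical, and the splitting rule (low 12 bits stay in the load/store offset, the rest goes into one ADD/SUB immediate) is an assumption about one way to reach the +-1MB range.

```python
MASK32 = 0xFFFFFFFF

def rol32(value, amount):
    """Rotate a 32-bit value left by `amount` bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & MASK32

def is_shifter_const(value):
    """True if value is an 8-bit constant rotated right by an even amount,
    i.e. encodable as an ARM data-processing immediate."""
    value &= MASK32
    return any(rol32(value, rot) < 256 for rot in range(0, 32, 2))

def split_offset(offset):
    """Split offset into (base_adjust, remainder) with |remainder| <= 4095.

    Hypothetical helper: returns None when the base adjustment does not fit
    a single ADD/SUB immediate, in which case a caller would fall back to
    loading the full constant.
    """
    magnitude = abs(offset)
    low = magnitude & 0xFFF      # stays in the load/store offset field
    high = magnitude - low       # applied to the base register
    if not is_shifter_const(high):
        return None
    sign = -1 if offset < 0 else 1
    return sign * high, sign * low

# The example from this message: add r1, r13, #16384 / ldr r0, [r1, #4]
print(split_offset(16388))
```

For any offset below 1MB the high part occupies bits 12..19, which is an 8-bit value at an even bit position, so the split always succeeds in that range.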
git-svn-id: trunk@21889 -
knows that the frame pointer needs to be available (and the code is also
much simpler this way), fixes test/units/system/tassert7 after r21843
git-svn-id: trunk@21869 -
If the needed adjustment is not expressible as a shifter constant, the old
code loaded a temporary register (fixed to r12) via a_load_const_reg and
used it to adjust the SP, resulting in:
mov r12, #44
orr r12, r12, #4096
sub sp, sp, r12
The new code tries to split the adjustment into 2 shifter constants and
does two separate adjustments:
sub sp, sp, #44
sub sp, sp, #4096
If that doesn't work, we fall back to the old code. That should happen
VERY rarely: only for stacks bigger than 256k whose size is not
expressible in 2 shifter constants.
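
A minimal sketch of such a two-constant split, written in Python for illustration rather than in the compiler's own Pascal. The helper names (ror32, is_shifter_const, split_two) are hypothetical, and the sketch only looks for splits whose two parts have disjoint bits; the real code may search differently.

```python
MASK32 = 0xFFFFFFFF

def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32

def is_shifter_const(value):
    """True if value is an 8-bit constant rotated right by an even amount."""
    value &= MASK32
    # Rotating right by (32 - rot) is the same as rotating left by rot.
    return any(ror32(value, (32 - rot) % 32) < 256 for rot in range(0, 32, 2))

def split_two(value):
    """Return (first, second), both shifter constants, with first + second
    equal to value, or None if no disjoint-bit split exists.

    Hypothetical helper: tries each of the 16 possible 8-bit windows for the
    first constant and checks whether the remaining bits fit a second one.
    """
    value &= MASK32
    for rot in range(0, 32, 2):
        window = ror32(0xFF, rot)
        first = value & window
        second = value & ~window & MASK32
        if is_shifter_const(second):
            return first, second
    return None

# The example from this message: sub sp, sp, #44 / sub sp, sp, #4096
print(split_two(4140))
```

For a 4-byte-aligned size below 256k only bits 2..17 can be set, and those fit in two adjacent 8-bit windows at even positions, which is consistent with the 256k figure above. A caller would emit a single SUB when the second part is 0 and use the temporary-register path when the function returns None.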
git-svn-id: trunk@21863 -
result location (NR_FUNCTION_RESULT_REG is not valid on all platforms)
o this requires passing the forced function result type (if any) to this
method
o a generic, basic thlcg.a_call_name() is now available that sets the
function result location; can be called by descendants
* the availability under all circumstances of the correct function return
type enables g_call_system_proc() on the JVM platform to now determine
by itself how many stack slots are removed by the call -> do so, instead
of manually counting them (or forgetting to do so and messing up the
maximum evaluation stack height calculations)
git-svn-id: trunk@21862 -