fpc/compiler/arm
florian c75486db89 * patch by Nico Erfurth:
Reorder unaligned Load sequence on ARM

The old version produced code like this:

ldrb rDEST, [rBASE]
ldrb rTemp, [rBASE, #1]
orr  rDEST, rDEST, rTemp, lsl #8 (2 stall cycles)
ldrb rTemp, [rBASE, #2]
orr  rDEST, rDEST, rTemp, lsl #16 (2 stall cycles)
ldrb rTemp, [rBASE, #3]
orr  rDEST, rDEST, rTemp, lsl #24 (2 stall cycles)

This creates a lot of stall cycles on ARM implementations with load
delay slots, such as Marvell Kirkwood or Intel XScale. With the usual stall
of up to 2 cycles after each load, this code requires a total of 13 cycles
(7 instructions + 6 stall cycles) in the best case.
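
For readers less familiar with the idiom, the sequence above assembles a 32-bit little-endian value from four single-byte loads, so it works for any alignment of rBASE. A minimal Python sketch of the same computation follows; the function name `load_unaligned_le32` and the sample buffer are made up for illustration and are not part of the compiler.

```python
def load_unaligned_le32(mem: bytes, base: int) -> int:
    """Assemble a 32-bit little-endian value from four single-byte loads.

    Hypothetical helper that mirrors the ldrb/orr sequence step by step.
    """
    dest = mem[base]              # ldrb rDEST, [rBASE]
    dest |= mem[base + 1] << 8    # ldrb rTemp, [rBASE, #1] + orr ..., lsl #8
    dest |= mem[base + 2] << 16   # ldrb rTemp, [rBASE, #2] + orr ..., lsl #16
    dest |= mem[base + 3] << 24   # ldrb rTemp, [rBASE, #3] + orr ..., lsl #24
    return dest


# Sample buffer (illustrative): the value starts at the odd offset 1.
memory = bytes([0xFF, 0x78, 0x56, 0x34, 0x12, 0xFF])
value = load_unaligned_le32(memory, 1)
print(hex(value))  # 0x12345678
```

The order in which the four bytes are loaded does not affect the result, which is why the loads can be rescheduled freely as long as each orr still sees its own byte.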

The new code uses a second temp register to avoid the stall cycles.

ldrb rDEST, [rBASE]
ldrb rTemp1, [rBASE, #1]
ldrb rTemp2, [rBASE, #2]
orr  rDEST, rDEST, rTemp1, lsl #8
ldrb rTemp1, [rBASE, #3]
orr  rDEST, rDEST, rTemp2, lsl #16
orr  rDEST, rDEST, rTemp1, lsl #24 (1 stall cycle)

The rescheduling and the second register bring the total down to 8 cycles.
If the last orr is rescheduled later, the total can even go
down to 7.
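
The totals quoted above can be re-added with a few lines of Python. This is only a tally of the stall annotations given in this message, assuming one issue cycle per instruction; it is not a pipeline model of Kirkwood or XScale, and the names `OLD_SEQUENCE`, `NEW_SEQUENCE` and `total_cycles` are illustrative.

```python
# Each entry: (instruction, stall cycles as annotated in the commit message).
OLD_SEQUENCE = [
    ("ldrb rDEST, [rBASE]", 0),
    ("ldrb rTemp, [rBASE, #1]", 0),
    ("orr  rDEST, rDEST, rTemp, lsl #8", 2),
    ("ldrb rTemp, [rBASE, #2]", 0),
    ("orr  rDEST, rDEST, rTemp, lsl #16", 2),
    ("ldrb rTemp, [rBASE, #3]", 0),
    ("orr  rDEST, rDEST, rTemp, lsl #24", 2),
]

NEW_SEQUENCE = [
    ("ldrb rDEST, [rBASE]", 0),
    ("ldrb rTemp1, [rBASE, #1]", 0),
    ("ldrb rTemp2, [rBASE, #2]", 0),
    ("orr  rDEST, rDEST, rTemp1, lsl #8", 0),
    ("ldrb rTemp1, [rBASE, #3]", 0),
    ("orr  rDEST, rDEST, rTemp2, lsl #16", 0),
    ("orr  rDEST, rDEST, rTemp1, lsl #24", 1),
]


def total_cycles(sequence):
    """One issue cycle per instruction (assumption) plus the annotated stalls."""
    return len(sequence) + sum(stalls for _, stalls in sequence)


old_total = total_cycles(OLD_SEQUENCE)
new_total = total_cycles(NEW_SEQUENCE)
# If the remaining stall before the last orr is hidden by later rescheduling:
new_total_no_stall = total_cycles([(ins, 0) for ins, _ in NEW_SEQUENCE])

print(old_total, new_total, new_total_no_stall)  # 13 8 7
```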

git-svn-id: trunk@21363 -
2012-05-22 19:09:20 +00:00
..
aasmcpu.pas * skip comments properly when searching for places for constant pool distances 2012-05-15 18:08:19 +00:00
agarmgas.pas + generic implementation of ReplaceForbiddenAsmSymbolChars() instead 2012-04-11 18:01:57 +00:00
aoptcpu.pas * patch by Nico Erfurth: 2012-05-17 08:31:44 +00:00
aoptcpub.pas * typo fixed 2012-03-11 08:24:44 +00:00
aoptcpuc.pas
aoptcpud.pas
armatt.inc + patch by Bernd to add the push/pop mnemonic for arm/thumb-2, resolves #22041 2012-05-15 18:52:09 +00:00
armatts.inc + patch by Bernd to add the push/pop mnemonic for arm/thumb-2, resolves #22041 2012-05-15 18:52:09 +00:00
armins.dat + patch by Bernd to add the push/pop mnemonic for arm/thumb-2, resolves #22041 2012-05-15 18:52:09 +00:00
armnop.inc + support for nop, msr and mrs instructions 2009-01-26 14:18:42 +00:00
armop.inc + patch by Bernd to add the push/pop mnemonic for arm/thumb-2, resolves #22041 2012-05-15 18:52:09 +00:00
armreg.dat + support for the ARM hard float EABI on Linux (patch by Peter Green): 2012-03-29 20:50:09 +00:00
armtab.inc + support for nop, msr and mrs instructions 2009-01-26 14:18:42 +00:00
cgcpu.pas * patch by Nico Erfurth: 2012-05-22 19:09:20 +00:00
cpubase.pas * patch by Nico Erfurth: 2012-05-17 08:03:51 +00:00
cpuinfo.pas + support for the ARM hard float EABI on Linux (patch by Peter Green): 2012-03-29 20:50:09 +00:00
cpunode.pas * the objc1 unit has been renamed to objc 2009-09-27 15:24:50 +00:00
cpupara.pas * use correct result registers for int64 results on armbe, resolves #21731 2012-04-20 18:07:06 +00:00
cpupi.pas * use r7 instead of r11 as frame pointer on Darwin/iOS, and make sure r7 2012-03-29 20:54:33 +00:00
cputarg.pas + some generic changes preparing for darwin/arm support 2008-10-02 15:10:13 +00:00
hlcgcpu.pas * create/destroy also the high level code generator for all architectures, 2011-08-20 07:21:16 +00:00
itcpugas.pas
narmadd.pas + support for the ARM hard float EABI on Linux (patch by Peter Green): 2012-03-29 20:50:09 +00:00
narmcal.pas + support for the ARM hard float EABI on Linux (patch by Peter Green): 2012-03-29 20:50:09 +00:00
narmcnv.pas * moved subsetref/reg and bit_set/test support from cgobj to hlcgobj for 2012-05-13 12:33:10 +00:00
narmcon.pas * fixed ARM and MIPS compilation after r14912 2010-02-18 21:19:17 +00:00
narminl.pas + support for the ARM hard float EABI on Linux (patch by Peter Green): 2012-03-29 20:50:09 +00:00
narmmat.pas * moved subsetref/reg and bit_set/test support from cgobj to hlcgobj for 2012-05-13 12:33:10 +00:00
narmset.pas * converted tcgcasenode.pass_generate_code() to hlcgobj 2011-08-20 07:48:33 +00:00
pp.lpi.template * improved template with help from Mattias Gaertner 2006-08-28 20:29:04 +00:00
raarm.pas o patch by Jeppe Johansen to fix mantis #17472: 2010-12-24 15:54:39 +00:00
raarmgas.pas + support for REV and several other ARMv6/ARMv6T2+ opcodes (mantis #21888) 2012-05-13 12:14:26 +00:00
rarmcon.inc + support for the ARM hard float EABI on Linux (patch by Peter Green): 2012-03-29 20:50:09 +00:00
rarmdwa.inc o added ARM VFPv2/VFPv3 support: 2009-12-03 22:46:30 +00:00
rarmnor.inc o added ARM VFPv2/VFPv3 support: 2009-12-03 22:46:30 +00:00
rarmnum.inc + support for the ARM hard float EABI on Linux (patch by Peter Green): 2012-03-29 20:50:09 +00:00
rarmrni.inc + support for the ARM hard float EABI on Linux (patch by Peter Green): 2012-03-29 20:50:09 +00:00
rarmsri.inc + support for the ARM hard float EABI on Linux (patch by Peter Green): 2012-03-29 20:50:09 +00:00
rarmsta.inc o added ARM VFPv2/VFPv3 support: 2009-12-03 22:46:30 +00:00
rarmstd.inc + support for the ARM hard float EABI on Linux (patch by Peter Green): 2012-03-29 20:50:09 +00:00
rarmsup.inc + support for the ARM hard float EABI on Linux (patch by Peter Green): 2012-03-29 20:50:09 +00:00
rgcpu.pas * patch by Jeppe Johansen to avoid corruption of frame/stack pointer by pre/post indexed operations, resolves #19679 2011-08-16 22:43:30 +00:00