fpc/rtl/arm
masta 64c122100f Small optimizations to FillChar for ARM
The new version is better optimized for the "common case".

We assume most of the data will be aligned, which is why the unaligned
case has been moved to the end of the function, so that the aligned
case is more cache- and pipeline-friendly.

I've also reduced the loop unrolling for the block transfer loop,
because for large blocks we'll most likely hit the write buffer limit
anyway.

I did some measurements. The new routine is a bit slower for fills of
fewer than 8 bytes, but beats the old one by 10-15% for 8 bytes or more.
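
For illustration only, the layout described above might be sketched in C
as follows. This is not the RTL routine, which is ARM assembly in arm.inc.
The name fill_bytes_sketch, the 4-byte alignment test and the block size of
two words per iteration are all assumptions made for the example. The
aligned path comes first in the function body and the unaligned fix-up
sits at the end, mirroring the ordering the commit describes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: fill count bytes at dest with value.
   Aligned case first (fall-through), unaligned fix-up at the end. */
void fill_bytes_sketch(void *dest, size_t count, uint8_t value)
{
    uint8_t *p = (uint8_t *)dest;
    /* Replicate the byte into all four lanes of a 32-bit word. */
    uint32_t word = value * 0x01010101u;

    /* Rare case: jump out of line so the common path stays straight. */
    if (((uintptr_t)p & 3u) != 0)
        goto unaligned;

aligned:
    /* Block loop with modest unrolling (two words per iteration,
       an arbitrary choice for this sketch). */
    while (count >= 8) {
        memcpy(p, &word, 4);
        memcpy(p + 4, &word, 4);
        p += 8;
        count -= 8;
    }
    /* Remaining 0..7 tail bytes. */
    while (count > 0) {
        *p++ = value;
        count--;
    }
    return;

unaligned:
    /* Store single bytes until p is word aligned (or count runs out),
       then rejoin the aligned path. */
    while (count > 0 && ((uintptr_t)p & 3u) != 0) {
        *p++ = value;
        count--;
    }
    goto aligned;
}
```

The goto-based layout is deliberate here: it keeps the aligned path as the
fall-through code, which is the point of moving the unaligned case to the
end. In portable C a compiler is free to reorder these blocks, so the
sketch shows the intent rather than guaranteed code placement.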

git-svn-id: trunk@21760 -
2012-07-02 23:54:19 +00:00
..
arm.inc Small optimizations to FillChar for ARM 2012-07-02 23:54:19 +00:00
divide.inc Use bx lr in ARM-RTL for armv5 2012-06-18 16:59:39 +00:00
int64p.inc
makefile.cpu
math.inc + support for the ARM hard float EABI on Linux (patch by Peter Green): 2012-03-29 20:50:09 +00:00
mathu.inc + support for the ARM hard float EABI on Linux (patch by Peter Green): 2012-03-29 20:50:09 +00:00
mathuh.inc
set.inc
setjump.inc * fix longjmp for -Cparmv7m, resolves #22014 2012-05-15 18:56:27 +00:00
setjumph.inc + support for the ARM hard float EABI on Linux (patch by Peter Green): 2012-03-29 20:50:09 +00:00
strings.inc ARM assembly versions of strupper and strlower 2012-06-18 16:59:34 +00:00
stringss.inc
thumb2.inc * Promoted result type of FPC_PCHAR_LENGTH and FPC_PWIDECHAR_LENGTH to SizeInt. 2011-06-13 04:59:17 +00:00