Commit Graph

471 Commits

Author SHA1 Message Date
florian
64ac48c815 * patch by Nico Erfurth: Better support for PLD on ARM
git-svn-id: trunk@21572 -
2012-06-09 17:28:05 +00:00
florian
3db61ae52d * patch by Nico Erfurth: Reworked regLoadedWithNewValue
Added better support for A_STR, A_LDR, A_STM, A_LDM.

Reworked the code the use a case statement for better readability.

git-svn-id: trunk@21571 -
2012-06-09 17:27:30 +00:00
florian
03a30ff036 * patch by Nico Erfurth: Remove STRH and STRB from instructionLoadsFromReg
STRH and STRB are not handled as sperate instructions by the code
generator.

git-svn-id: trunk@21570 -
2012-06-09 17:26:06 +00:00
florian
7599de416d * patch by Nico Erfurth: Reworked MatchOperand in ARM Peephole Optimizers
Added top_ref comperator which uses RefsEqual.
Reworked the code for easier readability by using a case statement.

git-svn-id: trunk@21569 -
2012-06-09 17:25:32 +00:00
florian
6e8594a9af * patch by Nico Erfurth: Minor fix for FoldShiftProcess peephole optimizer on ARM
Use UpdateUsedRegs and drop the check for reloading of the register, as
this is done in RegUsedAfterInstruction now.

git-svn-id: trunk@21520 -
2012-06-07 18:21:46 +00:00
florian
5b02a7cb9b * patch by Nico Erfurth: Check for register reloading in RegUsedAfterInstruction on ARM
This slightly changes the semantics of RegUsedAfterInstruction.
We now check if the `current value` of the register will be used later.
It will do `the right thing` for all the normal use cases.

git-svn-id: trunk@21519 -
2012-06-07 18:20:35 +00:00
florian
45c70ec81c * patch by Nico Erfurth: Support the usage of BIC instead of AND on ARM
BIC clears the specified bits, while AND keeps them. The usage of BIC
allows a broader range of shifterconsts to be used on the ARM cpu, often
saving a cycle.

Previously code like:
Data:=Data and $FFFFFF00

would result in

mvn r1, #255
and r0, r0, r1

This patch changes this to

bic r0, r0, #255

git-svn-id: trunk@21510 -
2012-06-06 19:45:26 +00:00
florian
fefc130efc * patch by Nico Erfurth: Handle BIC properly in taicpu.spilling_get_operation_type
BIC was handled as a read only operation, which caused it to overwrite
live register content sometimes.

git-svn-id: trunk@21509 -
2012-06-06 19:44:53 +00:00
florian
8cae4c9f23 * patch by Nico Erfurth: Fix for MovStrMov Peephole optimizer on ARM
The loop checked for the wrong instruction for .opcode = A_STR. Making
the whole optimizer non functional but at least not destructive.

git-svn-id: trunk@21508 -
2012-06-06 19:44:20 +00:00
florian
83fb4c289d * patch by Nico Erfurth: Implement FoldShiftProcess Peephole optimizer for ARM
This optimizer folds shift/roll operations into following data
instructions.

It will change code like:

mov r0, r0, lsl #16
add r1, r0, r1

into

add r1, r1, r0, lsl #16

Source registers will be reordered when necessary, also SUB/SBC will be
replaced with RSB/RSC and vice versa when reordering is required.

It could be expanded to support more operations like LDR/STR.

git-svn-id: trunk@21507 -
2012-06-06 19:43:36 +00:00
florian
5393efb128 * patch by Nico Erfurth: Support A_MOV and A_MVN in RedundantMovProcess
This changes the ARM Peephole optimizer RedundantMovProcess to also
recognize and modify something like the following sequence.

mov r0, r1
mov r0, r0, lsl #8

this would be changed into

mov r0, r1, lsl #8

git-svn-id: trunk@21506 -
2012-06-06 19:43:05 +00:00
florian
4ea1d22c5a * patch by Nico Erfruth: Support BX for function returns on armv5+
BX is supported from ARMv4T onwards, but i don't have a armv4t device to
test it.

Using BX instead of mov pc,lr allows for a better pipeline utilization
by enabling the CPUs branch predictor to work properly.

git-svn-id: trunk@21505 -
2012-06-06 19:42:26 +00:00
florian
3ae5fc8c04 * patch by Nico Erfurth: adds a check for SM_ASR to also support removal of unnecessary sign extension before STRH.
git-svn-id: trunk@21446 -
2012-05-31 20:24:48 +00:00
florian
4f273aa08d * patch by Nico Erfurth: Handle STR*/LDR* properly in ARM Peephole optimizers
git-svn-id: trunk@21444 -
2012-05-31 17:00:19 +00:00
florian
fbc77b74c2 * patch by Nico Erfurth to remove superfluouse moves
git-svn-id: trunk@21422 -
2012-05-28 21:58:06 +00:00
florian
c348b6f2cc * patch by Nico Erfurth:
- Support MLA and MUL in DataMov2Data
- SMLAL and UMLAL are also reading from oper[0]
- UMLAL, UMULL, SMLAL and SMULL are writing to oper[1]

git-svn-id: trunk@21421 -
2012-05-28 18:11:31 +00:00
florian
9e180fb318 * remove unneeded zero extensions from 16 to 32 Bit
git-svn-id: trunk@21404 -
2012-05-28 07:21:27 +00:00
florian
21b94f675f + add for MLA the same register interferences as for MUL
* register interferences for MUL/MLA are only needed for less than ARMv6

git-svn-id: trunk@21385 -
2012-05-24 19:14:58 +00:00
florian
638d0d49c0 + take advantage of the mla instruction when calculating array offsets
git-svn-id: trunk@21375 -
2012-05-23 20:48:26 +00:00
florian
c75486db89 * patch by Nico Erfurth:
Reorder unaligned Load sequence on ARM

The old version produced code like that:

ldrb rDEST, [rBASE]
ldrb rTemp, [rBASE, #1]
orr  rDEST, rDEST, rTEMP lsl #8 (2 stall cycles)
ldrb rTemp, [rBASE, #2]
orr  rDEST, rDEST, rTEMP lsl #16 (2 stall cycles)
ldrb rTemp, [rBASE, #3]
orr  rDEST, rDEST, rTEMP lsl #24 (2 stall cycles)

This creates a lot of stall-cycles on ARM Implementations with load
delay slots like Marvel Kirkwood or Intel XScale. With the usual up to 2
stall-cycles this code requires a total of 13 cycles (7 instructions + 6 stall
cycles) in best case.

The new code uses a second temp register to avoid the stall cycles.

ldrb rDEST, [rBASE]
ldrb rTemp1, [rBASE, #1]
ldrb rTemp2, [rBASE, #2]
orr  rDEST, rDEST, rTEMP1 lsl #8
ldrb rTemp1, [rBASE, #3]
orr  rDEST, rDEST, rTEMP2 lsl #16
orr  rDEST, rDEST, rTEMP1 lsl #24 (1 stall cycle)

The rescheduling and second register bring the total cycles down to 8.
If a later rescheduling should happen for the last orr it even can go
down to 7.

git-svn-id: trunk@21363 -
2012-05-22 19:09:20 +00:00
florian
5f0bcd9248 * patch by Nico Erfurth:
Optimize ARM OP_MUL/OP_IMUL for x*ispowerof2(const+1) cases

Calculations like a*7 can be optimized to a*8-a with the usage of RSB and left
shifts which can be done in a single cycle.

git-svn-id: trunk@21351 -
2012-05-20 20:50:04 +00:00
florian
05a8783b1e * patch by Nico Erfurth:
Improve ARM-Peephole Optimizers

1.) Introduce a ARM-specific RegUsedAfterInstruction which analyzes
instructions and reg allocation information to see if a register is
really needed afterwards to decide if some special optimizations can be
done.

2.) Introduce "RemoveSuperfluousMove"
This tries to fold mov into a previous Data-Instruction (ADD, ORR, etc)
or LDR-Instruction.

3.) Introduce new Optimizer "DataMov2Data" and modify LdrMov2Ldr to use
RemoveSuperfluousMove

4.) Expand Ldr* and Str* Optimizers to also work on {Ldr,Str}{,b,h}

git-svn-id: trunk@21314 -
2012-05-17 08:31:44 +00:00
florian
798c9340cc * patch by Nico Erfurth:
Inline a couple of small functions of the ARM-Compiler

These small changes improved overall compile times of the fpc suite by
about 2-3% running on an 1.2GHz Kirkwood.

git-svn-id: trunk@21312 -
2012-05-17 08:03:51 +00:00
florian
b2813abec2 + patch by Bernd to add the push/pop mnemonic for arm/thumb-2, resolves #22041
git-svn-id: trunk@21310 -
2012-05-15 18:52:09 +00:00
florian
2560266e5d * skip comments properly when searching for places for constant pool distances
git-svn-id: trunk@21307 -
2012-05-15 18:08:19 +00:00
florian
748694a325 * fixes some issues with reg. allocation information
git-svn-id: trunk@21303 -
2012-05-15 18:06:41 +00:00
Jonas Maebe
edd42aa42a * moved subsetref/reg and bit_set/test support from cgobj to hlcgobj for
future use by high level code generator targets
   o this in turn required that all a_load*_loc* methods are called via
     hlcg rather than via cg, since a location can be a subsetref/reg and
     and those are no longer handled in tcg
   o that then required moving several force_location_* routines into
     thlcg because they use a_load_loc*, but did not take tdef size
     parameters (which are required by the thlcg a_load_loc* routines)
   o the only practical consequence is that from now on, you have to
     use hlcg.location_force_mem/reg() (fpureg not yet) and
     hlcg.gen_load_loc_cgpara() instead of the removed versions from ncgutil,
     and hlcg.a_load*loc*() instead of cg.a_load*loc* if a subsetref/reg
     might be involved

git-svn-id: trunk@21287 -
2012-05-13 12:33:10 +00:00
Jonas Maebe
ef2d665a50 + support for REV and several other ARMv6/ARMv6T2+ opcodes (mantis #21888)
git-svn-id: trunk@21285 -
2012-05-13 12:14:26 +00:00
Jonas Maebe
85a3fd3357 + ossinttype/osuinttype defs that correspond to OS_SINT/OS_INT for use in
the high level code generator

git-svn-id: trunk@21279 -
2012-05-12 16:03:15 +00:00
florian
77ae218556 * safer calculation of pool placement on arm
git-svn-id: trunk@21226 -
2012-05-04 19:10:30 +00:00
Jonas Maebe
834026bfb5 * synchronised with trunk up to r21067
git-svn-id: branches/jvmbackend@21068 -
2012-04-26 21:24:20 +00:00
florian
2959d596f9 * patch by Nico Erfurth: Remove superfluous mov from MovStrMov sequences
git-svn-id: trunk@21067 -
2012-04-26 20:31:13 +00:00
florian
aa2a9dbf2e patches by Nico Erfurth to improve the arm peephole optimizer:
* Introduce MatchInstruction and MatchOperand

MatchInstruction allows to match an instruction by condition and
oppostfix. MatchOperand checks if an operand is a register and matches
another operand. In the future this could be overloaded with other
versions not only accepting TRegister.

* Optimize cmp,moveq,movne sequence on ARM

This patch implements an peephole optimizer for the following sequence:

  cmp   reg,const1
  movne reg,const2
  moveq reg,const1

* Small improvements to the ARM peephole optimizer

Most instructions in the ARM ISA have taicpu(p).oper[0]^.typ = top_reg
as the only option, so there is no need to check for it if we're
looking at those instructions.

* Remove redundant mov instructions on ARM

This is an addition to the ARM PeepHole Optimizer.
It folds code like this:

mov reg1, reg2
add reg1, reg1, (const|reg)

git-svn-id: trunk@21024 -
2012-04-24 18:25:19 +00:00
florian
532102d3fa * use correct result registers for in64 results on armbe, resolves #21731
git-svn-id: trunk@20945 -
2012-04-20 18:07:06 +00:00
Jonas Maebe
aee5380ae0 * merged trunk up to r20882
o support for the new codepage-aware ansistrings in the jvm branch
   o empty ansistrings are now always represented by a nil pointer rather than
     by an empty string, because an empty string also has a code page which
     can confuse code (although this will make ansistrings harder to use
     in Java code)
   o more string helpers code shared between the general and jvm rtl
   o support for indexbyte/word in the jvm rtl (warning: first parameter
     is an open array rather than an untyped parameter there, so
     indexchar(pcharvar^,10,0) will be equivalent to
     indexchar[pcharvar^],10,0) there, which is different from what is
     intended; changing it to an untyped parameter wouldn't help though)
   o default() support is not yet complete
   o calling fpcres is currently broken due to limitations in
     sysutils.executeprocess() regarding handling unix quoting and
     the compiler using the same command lines for scripts and directly
     calling external programs
   o compiling the Java compiler currently requires adding ALLOW_WARNINGS=1
     to the make command line

git-svn-id: branches/jvmbackend@20887 -
2012-04-15 15:54:10 +00:00
Jonas Maebe
ac43eb9b70 + generic implementation of ReplaceForbiddenAsmSymbolChars() instead
of the AVR-specific ifdef'ed variant
   o since the only special character we use in mangled names on all platforms
     is $, added a new field to tasminfo called "dollarsign" that holds the
     character $'s should be replaced with (if it doesn't have to be replaced,
     leave it at $)

git-svn-id: trunk@20801 -
2012-04-11 18:01:57 +00:00
florian
c5445399c6 * take care also of reg. allocation information after the current instruction when moving it
git-svn-id: trunk@20709 -
2012-04-05 14:21:41 +00:00
florian
9867f34398 * the arm rescheduler has not only to move instructions but also associated register allocations
git-svn-id: trunk@20707 -
2012-04-04 21:21:52 +00:00
florian
bb8be38607 - removed some no longer used constants
git-svn-id: trunk@20688 -
2012-04-01 20:49:34 +00:00
Jonas Maebe
2a8f624eb0 * fixed returning small but "non-simple" records on ARM platforms that use
the old APCS calling convention (such as iOS): they are returned by
    reference

git-svn-id: trunk@20665 -
2012-03-29 20:54:51 +00:00
Jonas Maebe
bba4b02eb2 * use r7 instead of r11 as frame pointer on Darwin/iOS, and make sure r7
always points to the previous r7 on the stack (with the saved return
    address coming right after it) so that the debugger and crashreporter
    can use it for backtraces as specified in the ABI
   o changed NR_FRAME_POINTER_REG and RS_FRAME_POINTER_REG from a symbolic
     into a typed constant, and added a new method to tprocinfo that can
     be used to initialze it (so it can be inited to r7/r11 depending on
     the target platform)
  * allow using r9 on Darwin, it was only used by the system on iOS up to
    2.x, which we no longer support
  * prefer using r9 and r12 before r4..r11 on Darwin, because they are
    volatile and hence do not have to be saved

git-svn-id: trunk@20661 -
2012-03-29 20:54:33 +00:00
Jonas Maebe
6ba8dc7146 + support for the ARM hard float EABI on Linux (patch by Peter Green):
o new eabihf (hard float) abi
   o vfpv3_d16 variant of VFP (default variant used by EABI assemblers: VFPv3
     with only 16 double registers instead of 32) and pass it to GNU as
   o make the odd numbered single precision floating point VFP registers
     available for explicit allocation for use by the calling convention
  * fixed copy/paste error in stdname of S30 register
  -> use -dFPC_ARMHF to create an ARM eabi hard float compiler
  (mantis #21554)

git-svn-id: trunk@20660 -
2012-03-29 20:50:09 +00:00
florian
0cbdc1ae6e * deactivate assembler scheduler, needs some more fixes first
git-svn-id: trunk@20537 -
2012-03-18 17:05:22 +00:00
florian
38d3a081f6 * update of TODOs
git-svn-id: trunk@20513 -
2012-03-11 20:12:46 +00:00
florian
0fe22a358b + first version of ldr instruction scheduler on arm
git-svn-id: trunk@20512 -
2012-03-11 19:10:58 +00:00
florian
e84a43768e * typo fixed
git-svn-id: trunk@20511 -
2012-03-11 08:24:44 +00:00
florian
2f5ce095ce * RefsHaveIndexReg -> cpurefshaveindexreg
* cpurefshaveindexreg defined properly in fpcdefs.inc

git-svn-id: trunk@20504 -
2012-03-10 19:43:52 +00:00
florian
7ea7031017 + cpu type armv5t
git-svn-id: trunk@20500 -
2012-03-10 19:04:22 +00:00
florian
9c6e3d317a * reenabled ldr/ldr and ldr/str optimization
git-svn-id: trunk@20497 -
2012-03-10 17:09:42 +00:00
florian
841d67ec81 * don't waste an extra register when copying 4 bytes
git-svn-id: trunk@20475 -
2012-03-05 19:12:00 +00:00