Commit Graph

653 Commits

Author SHA1 Message Date
florian
45383fd32d + a lot missing flag allocs/deallocs added
git-svn-id: trunk@22201 -
2012-08-23 08:54:52 +00:00
florian
2d2c66467c + optimize op ... / cmp .... when possible
git-svn-id: trunk@22200 -
2012-08-23 08:54:47 +00:00
florian
a92ca7c456 * adjust the reg. allocations of the target register in RemoveSuperfluousMove
git-svn-id: trunk@22194 -
2012-08-22 19:52:37 +00:00
florian
3d7b603d11 * get rid or move the allocation of the replaced register if possible
git-svn-id: trunk@22193 -
2012-08-22 19:52:30 +00:00
florian
77e579f59f * RemoveSuperfluousMove uses FindRegDeAlloc to find out if the register used in the move can be removed
* RemoveSuperfluousMove fixes partially the register allocation changes caused by the mov

git-svn-id: trunk@22192 -
2012-08-22 19:52:23 +00:00
florian
5fd457e586 * when determining of a register is used after an instruction, new allocs should not be taken into account
git-svn-id: trunk@22189 -
2012-08-22 19:52:03 +00:00
florian
c0425c48fd * make use of GetNextInstructionUsingReg
git-svn-id: trunk@22186 -
2012-08-22 19:51:40 +00:00
florian
f3f5be2af1 * RemoveSuperfluousMove should not mess with moves targetting lr or pc
git-svn-id: trunk@22185 -
2012-08-22 19:51:31 +00:00
florian
93eb20d407 + GetNextInstructionUsingReg
git-svn-id: trunk@22184 -
2012-08-22 19:51:19 +00:00
florian
55e6da6d28 * make cpubase for arm use inlining
git-svn-id: trunk@22183 -
2012-08-22 19:51:08 +00:00
florian
d8161c185c + track usage of flags by using a new register RS_/NR_DEFAULTFLAGS
git-svn-id: trunk@22179 -
2012-08-22 19:37:51 +00:00
florian
2a14394cf5 * cleaned up scheduler code, created own scheduler class to avoid unneeded passes through the assembler
git-svn-id: trunk@22133 -
2012-08-19 19:15:34 +00:00
florian
a3bf956c33 * improved main loop of TCpuPreRegallocScheduler.PeepHoleOptPass1Cpu
* reordered conditions in scheduler main loop so they abort potentially quicker

git-svn-id: trunk@22132 -
2012-08-19 19:13:49 +00:00
florian
765fb18679 + add a description to the cpuflags where I know the exact meaning/definition
git-svn-id: trunk@22119 -
2012-08-17 20:45:46 +00:00
florian
54e2b40ab4 * revert the parameter type change of the last commit, it was an overleft from a failed fix attempt
git-svn-id: trunk@22116 -
2012-08-17 19:36:37 +00:00
florian
ba6ba52e7f * instruction scheduling is pretty slow so make it a level 3 optimization for now
git-svn-id: trunk@22115 -
2012-08-17 19:36:29 +00:00
florian
45eafd3e65 * fix MovMov optimization if the second mov is a mov rX,rX
git-svn-id: trunk@22114 -
2012-08-17 19:36:22 +00:00
florian
4b4e08c28b * fixes copy&paste errors when moving end of live pointers
git-svn-id: trunk@22113 -
2012-08-17 19:36:16 +00:00
florian
53a0d3e3a3 * fixed typo when checking live start of references
git-svn-id: trunk@22112 -
2012-08-17 19:36:10 +00:00
florian
5ceeb8aaa9 * enable scheduler when compiling at least with -O2
git-svn-id: trunk@22111 -
2012-08-17 19:36:04 +00:00
florian
a693fe9fb7 + implemented TCpuPreRegallocScheduler.SwapRegLive and make use of it to be able to reschedule instructions before register allocation
git-svn-id: trunk@22110 -
2012-08-17 19:35:59 +00:00
florian
7e5b8584cf * set MaxOps to 4 for the optimizer because fpc generates now mla instructions
git-svn-id: trunk@22106 -
2012-08-17 12:38:59 +00:00
florian
354cac2bb6 + completed arm architectures
* ldrd/strd and pld collected under the edsp define

git-svn-id: trunk@22104 -
2012-08-17 10:37:27 +00:00
florian
7588896775 * make use of cpuflags in the arm compiler
* armv5te architecture

git-svn-id: trunk@22103 -
2012-08-17 10:37:17 +00:00
florian
e4f89fe524 + introduce cpuflags for arm
git-svn-id: trunk@22090 -
2012-08-15 15:49:05 +00:00
florian
ecb037ad79 + tarminnode.pass_1 to set expectloc correctly
git-svn-id: trunk@22074 -
2012-08-13 15:03:35 +00:00
florian
d2aa35e9de * throw an internal error if code generation depends on expectloc but expectloc and real loc do not match
git-svn-id: trunk@22073 -
2012-08-13 15:02:55 +00:00
florian
33f287d320 + tarminnode.in_smallset making use of tst
git-svn-id: trunk@22064 -
2012-08-11 22:10:45 +00:00
florian
19debd87cc * start with a qword aligned frame pointer to enable more ldrd/strd optimizations
git-svn-id: trunk@22061 -
2012-08-11 15:12:19 +00:00
florian
371ef7bada * cover more cases in AlignedToQWord
git-svn-id: trunk@22060 -
2012-08-11 15:11:43 +00:00
florian
db7e029574 * strd/ldrd optimization might be only done on dword operations
git-svn-id: trunk@22059 -
2012-08-11 15:11:10 +00:00
florian
8c45a909be + support ldr/ldr -> ldrd and str/str -> strd optimization where appliable
git-svn-id: trunk@22058 -
2012-08-11 11:45:54 +00:00
florian
4d86d25c6c * -O4 switch for optimizations which are correct but which might have unexpected effects
like field reordering (possible problems cracker classes) or using ebp as normal register (broken
      stack traces from dump_stack)
    + niln is also valid in a cse domain
    * parameters passed by reference shall have a complexity >1
    * load nodes from outer scopes shall have a complexity >1
    * better cse debugging
    + more node types added to cse
    * consider parameters passed by reference in cse
    * take care of cse in parameters in simple cases

git-svn-id: trunk@22050 -
2012-08-09 18:58:54 +00:00
tom_at_work
810adb2f65 Merge with trunk r22040. Regenerated makefiles.
git-svn-id: branches/targetandroid@22046 -
2012-08-09 08:12:34 +00:00
florian
b330bba0bc + introduce -Oofastmath
* limit the application of the tree transformation introduced in r21986 to safe cases and -Oofastmath

git-svn-id: trunk@22040 -
2012-08-08 19:35:45 +00:00
masta
aa21845cd9 Small optimization for OP_AND on ARM
Especially with 64bit operators the CG sometimes generates:
and r0, r1, #0
Which just clears r0 and is equivalent with
mov r0, #0

git-svn-id: trunk@22032 -
2012-08-08 06:44:20 +00:00
florian
7513291ad8 * generate different code for OS_S8 -> OS_16 conversion which might fold better, idea by Nico Erfurth
git-svn-id: trunk@22027 -
2012-08-07 19:36:46 +00:00
masta
6529307d9e Don't emit useless AND/BICs in ARM CG
In certain cases the CG would emit something like
bic r1, r0, #0
As BIC is clearing the specified bits this is equivalent to
mov r1, r0
This patch changes the CG to emit the mov instead which the register
allocator will hopefully remove most of the time.

git-svn-id: trunk@22024 -
2012-08-07 06:46:45 +00:00
masta
9e039936bf Support more operators in FoldShiftProcess on ARM
Now we can also fold shifts into teq, tst, cmp, cmn instructions.

git-svn-id: trunk@22023 -
2012-08-07 06:46:32 +00:00
florian
f619a1aaf6 * fld/fst can have a base register+offset
git-svn-id: trunk@22016 -
2012-08-05 18:34:13 +00:00
florian
e81ba0f82e + make use of the armv6+ sign/zero extension instructions if appropriate
git-svn-id: trunk@22013 -
2012-08-05 14:04:11 +00:00
florian
eb1efdff8a + introduce cstylearrayofconst because pocall_mwcall was forgotten at several places
git-svn-id: trunk@22012 -
2012-08-05 08:48:23 +00:00
florian
19ed835f2b * don't generate an extra indirection when loading vfp constants
git-svn-id: trunk@22010 -
2012-08-04 17:01:57 +00:00
masta
8a684c1f10 Don't generate IT instruction in second_cmp64bit for Thumb-2
Currently the register spiller can not handle the "bond" between IT* and
a following instruction, sometimes breaking them apart, which breaks the
build or worse the result.

So for now we're not emitting A_IT* in second_cmp64bit anymore but use a
conditional jump instead.

This fixes Mantis #22520

git-svn-id: trunk@22009 -
2012-08-04 16:55:58 +00:00
masta
1c51b8d906 Disable 64bit shifts for thumb2 - Fix for Mantis #22520
In r21686 I've introduced optimized 64bit shifts for ARM. But the
methods did not check for which machine it has to generate the code.

This patch disables the optimized code for now if the target is in
cpu_thumb2 and falls back to the generic code.

There are 2 problems with the current code:

1.) Thumb-2 does not support shift by register on all data instruction
as ARM does.
2.) The code does not generate the required IT-block for the conditional
executed code.

git-svn-id: trunk@21997 -
2012-08-02 00:56:21 +00:00
masta
c16871e129 Generate better code in Tthumb2cgarm.g_flags2reg
The old code generated a strange IT-sequence:

IT EQ
MOVEQ r0, #1
IT NE
MOVNE r0, #1

Now we generate:

ITE EQ
MOVEQ r0, #1
MOVNE r0, #1

IT stands for IfThen, ITE for IfThenElse it has a couple of other forms
where the instruction gets extended to handle more of the following
instructions. So we have ITEE, ITETE etc, up to 4 instructions can be
handled.

git-svn-id: trunk@21996 -
2012-08-02 00:56:15 +00:00
florian
023d632f44 * optimize also lsr/asr, lsl, lsr/asr sequences on arm
git-svn-id: trunk@21981 -
2012-07-28 22:30:11 +00:00
florian
283afbcb07 * new controllers by lelekx, resolves #22523
git-svn-id: trunk@21980 -
2012-07-28 21:57:29 +00:00
florian
c5ad1bce7b * avoid uncessary zero extensions in case code
git-svn-id: trunk@21979 -
2012-07-28 20:09:21 +00:00
florian
c8435b503f * better folding of consecutive shift operations
git-svn-id: trunk@21978 -
2012-07-28 17:59:45 +00:00
florian
614afc1c8f * pass march to GNU AS for cpu_armv6 and cpu_armv7
git-svn-id: trunk@21958 -
2012-07-23 20:20:17 +00:00
Jonas Maebe
0a1157da38 * fixed memory leaks in the compiler introduced in r21862 by marking and
releasing temporarily created function result locations

git-svn-id: trunk@21953 -
2012-07-23 13:49:29 +00:00
florian
d5aa89449e * generate less register wasting code for 64 bit comparions
git-svn-id: trunk@21950 -
2012-07-22 21:07:33 +00:00
masta
be6bf6e3f7 Fix possible access violation introduces in r21885
r21885 added a new peephole optimizer. The associated code refactoring
missed a check for

  tai(hp1).typ = tai_instruction

Which can lead to an access violation later on, because the rest of the
code expects to find a taicpu in hp1.

git-svn-id: trunk@21949 -
2012-07-22 18:06:08 +00:00
Jonas Maebe
3798b79fd7 + optimization that (re)orders instance fields of Delphi-style classes in
order to minimise memory losses due to alignment padding. Not yet enabled
    by default at any optimization level, but can be (de)activated separately
    via -Oo(no)orderfields
   o added separate tdef.structalignment method that returns the alignment
     of a type when it appears in a record/object/class (factors out
     AIX-specific double alignment in structs)
   o changed the handling of the offset of a delegate interface
     implemented via a field, by taking the field offset on demand
     rather than at declaration time (because the ordering optimization
     causes the offsets of fields to be unknown until the entire
     declaration has been parsed)

git-svn-id: trunk@21947 -
2012-07-22 16:47:19 +00:00
masta
aa4fe66153 Fix ARM ASM-reader for MVN/CMP/CMN/TST/TEQ
Like MOV these instructions support 2 operands, with the second beeing a
shifterop.

Without this patch the asm reader would fail on something like

cmp r0, r1, lsr 16

with

Error: Unknown identifier "LSR"

git-svn-id: trunk@21911 -
2012-07-15 01:03:08 +00:00
masta
e2a744e19b Consolidate do_spill_read/do_spill_written on arm
ARM can not reference an arbitrary offset so it needs some special
handling if the offset goes beyond abs(4095).

The code for do_spill_read and do_spill written used to be very similar.
I've partially factored out the code into spilling_create_load_store.

The former code loaded the offset from a constant pool, which is a waste
of memory-bandwidth and cache lines. The new code tries to find a way to
adjust the baseregister so the memory location can be reached more
easily, this allows us to handle at least +-1MB with just a single
additional ADD or SUB instruction. If that fails we'll resort to the normal
constant loading code, which on it's own will fallback to loading the
constant from a constant-pool.

So instead of:
ldr r1, =16388
ldr r0, [r13, r1]

which will at least uses 4 cycles (2 Instruction cycles + 2 stall
cycles) on most cores.

We try to generate:
add r1, r13, #16384
ldr r0, [r1, #4]

which most armv5+ cores will execute in 2 cycles. We'll also save on
DCache usage.

git-svn-id: trunk@21889 -
2012-07-12 01:11:23 +00:00
florian
701a5d76bb * remove unneeded movs
git-svn-id: trunk@21885 -
2012-07-11 20:58:52 +00:00
masta
57b67dfa30 Better SP adjustments on entry/exit for ARM
If the needed adjustment is not expressible in a shifterconst, the old code
loaded a temporary register (fixed to r12) via a_load_const_reg and used it
to adjust the SP. Resulting in:

mov r12, #44
orr r12, r12, #4096
sub sp, sp, r12

The new code will try to split the adjustment into 2 shifterconstants and
will do two seperate adjustments:

sub sp, sp, #44
sub sp, sp, #4096

If that doesn't work we'll fall back to the old code. But that should
happen VERY rarely, only for stacks bigger than 256k which are not
expressible in 2 shifter constants.

git-svn-id: trunk@21863 -
2012-07-11 08:41:45 +00:00
florian
95732625cc * use r11 as a normal register if no frame pointer is needed
git-svn-id: trunk@21834 -
2012-07-09 17:17:23 +00:00
tom_at_work
4150f0a2fb Rebase with r21814
git-svn-id: branches/targetandroid@21815 -
2012-07-07 23:09:20 +00:00
masta
dbf0404fb0 More consolidation of OP_SHL/SHR/ROR/SAR in ARM CodeGen
This removes the duplications in a_op_reg_reg_reg_checkoverflow.
OP_ROL stays seperate because it needs some special treatment again.

The code for OP_ROL was changed, previously it generated:
mov tempreg, #32
sub src1, tempreg, src1
mov dst, src2, ror src1

This would trash src1, which MIGHT be a problem, but i'm not totally
sure. But the mov/sub was replaced with rsb, so the new code looks like
this.

rsb tempreg, src1, #32
mov dst, src2, ror tempreg

If src1 gets freed afterwards the regallocator should be able to change
that into:

rsb src1, src1, #32
mov dst, src2, ror src1

git-svn-id: trunk@21804 -
2012-07-06 15:01:31 +00:00
masta
d2d5d17557 Consolidate handling of OP_SHL/SHR/ROL/ROR/SAR in ARM CodeGen
The previous code was full with duplicated code, this new version just
maps the OP_* to the correct SM_* and does some special handling for
OP_ROL which is done via OP_ROR.

git-svn-id: trunk@21801 -
2012-07-06 12:10:42 +00:00
masta
504a0ce0ca Fix for Mantis #22326
This fixes 64bit shifts on arm with a constant shift value of 0.

The old code would have emitted something like this
mov r0, r0, lsl #32
as 32 is an invalid shift value (and would be wrong anyway) the
assembler declined to assemble the produced source.

The new code will just not emit any code for a shift value of 0.

tests/test/tint642.pp now tests shl/shr 0 on 64 bit values.
tests/webtbs/tw22326.pp is also added as an additional test.

git-svn-id: trunk@21746 -
2012-07-01 08:09:00 +00:00
Jonas Maebe
7a0ae38700 + also specify the parameter def when allocating a parameter via
getintparaloc + adapted all call sites of getintparaloc. This
    led to a number of additional, related changes:
   o corrected the type information for some getintparaloc parameters
   o don't allocate some intparalocs in cases they aren't used
   o changed "const tvardata" parameter into "constref tvardata" for
     fpc_variant_copy_overwrite to make pass-by-reference semantics
     explicit
   o moved a number of routines that now have to call find_system_type()
     from cgobj to hlcgobj so that cgobj doesn't have to start depending
     on the symtable unit
   o added versions of the cpureg alloc/dealloc methods to hlcgobj that
     call through to their cgobj counter parts, so we can call save/restore
     the cpu registers before/after calling system helpers from hlcgobj
     (not implemented in hlcgobj itself, because all basic register
      allocator functionality is still part of cgobj/cgcpu)

git-svn-id: trunk@21696 -
2012-06-24 15:02:12 +00:00
Jonas Maebe
c3ea451aea * set tcgpara.vardef when creating parameter info
git-svn-id: trunk@21693 -
2012-06-24 15:01:54 +00:00
Jonas Maebe
2d48396587 - removed redundant checks
git-svn-id: trunk@21692 -
2012-06-24 15:01:48 +00:00
Jonas Maebe
587244c088 * factored out common code from get_funcretloc()
* set tcgpara.def for the function return location (field introduced for and
    already used by the JVM code generator, required for future hlcg
    functionality)

git-svn-id: trunk@21691 -
2012-06-24 15:01:42 +00:00
masta
ca70207bc0 Support 64-bit shifts on ARM.
This code generate different versions of assembly depending on the
amount to shift.

Variable Amount: 6 cycles (5 if last shift can be folded)
Constant 1     : 2 cycles
Constant 2-31  : 3 cycles (2 if last shift can foldable)
Constant 32    : 1 cycle  (depends on the register allocator)
Constant 33-64 : 2 cycles

This should speed up softfpu on arm a bit.

git-svn-id: trunk@21686 -
2012-06-23 20:36:27 +00:00
masta
3566956389 Fix ARM-Assembler output for RRX-Shifterops
RRX (Rotate Right with eXtend) does a single bit right rotation through
the carry. So it does not take any arguments, neither constant nor
register.

Also remove redundant shiftmode2str and replace usage of it with gas_shiftmode2str.

git-svn-id: trunk@21685 -
2012-06-23 20:36:16 +00:00
masta
59c726c829 Support ABS intrinsic on ARM
This code will generate the following sequence on arm:
r1=dst
r0=src

movs r1, r0
rsbmi r1, r0, #0

movs will set the N-flag when the MSB of r0 is set, if it is set, rsb
will calculate dst:=0-src;

git-svn-id: trunk@21678 -
2012-06-21 20:12:36 +00:00
masta
aeb15ba2b6 Fixed postfix check in taicpu.is_same_reg_move
The old version did not check the S-Postfix for MOV, which results in
removing instructions like:

movs r0, r0

which breaks later flag usage.

git-svn-id: trunk@21676 -
2012-06-21 20:12:25 +00:00
masta
2768e0fc12 Folded Add/Sub/Or Splitter, lots of debug output
git-svn-id: trunk@21660 -
2012-06-20 12:39:28 +00:00
masta
5498456269 Add LsrAndLsr Peephole Optimizer for ARM
Remove the superfluous and in:
mov r0, r0, lsr #24
and r0, r0, #255

Doing this allows for better shift-folding later

git-svn-id: trunk@21659 -
2012-06-20 12:39:19 +00:00
masta
92c47148cc Optimize 8/16 OP_NOT on ARM
This now generates:

mvn r0, r0, lsl #24/#16
mov r0, r0, lsr/asr #24/#16

The lsr/asr might be folded into a following instruction, making the
whole operation 1 cycle instead of 2-3 with the previous solution.

git-svn-id: trunk@21658 -
2012-06-20 12:39:09 +00:00
masta
0f3441a9c2 Split OP_ADD, OP_SUB, OP_AND and OP_ORR into multiple instructions if that can avoid constant construction or even loading from a pool.
OP_ADD, OP_SUB, OP_ORR will be split into two intructions if possible when a load/const
construction is required.

OP_AND is a bit different, because we can't just split it up, but we try
to find a two instruction BIC-equivalent to it.

Till now code like

a:= a and $FFFF;

produced code like

mov r0, $FF00
orr r0, r0, $FF
and r1, r1, r0

With this addition we produce code like:

bic r0, r0, $FF00
bic r0, r0, $FF

Saving us at least a cycle and in some cases also a load from the
constant-pool.

This uses the new split_into_shifter_const function.

git-svn-id: trunk@21647 -
2012-06-18 16:59:29 +00:00
masta
f11fbe527e Improve loading of ARM constant values
*  use split_into_shifter_const to reduce the MOV/ORR combination to a
   single check and allow a broader rang of combinations.
*  Introduce MVN/BIC combination to load values which have more 1 than 0
   bits set (like small negative values)

git-svn-id: trunk@21646 -
2012-06-18 16:59:24 +00:00
masta
d987cee96a Introduce split_into_shifter_const to ARM-Code Generator
This functions tries to split up a 32-bit value into two shifter
constants. This approach finds a broader range for two shifter constant
combinations.

git-svn-id: trunk@21645 -
2012-06-18 16:59:19 +00:00
masta
3205169ab9 Use roldword intrinsic instead of function rotl.
These days we don't need the hand coded rol anymore.

git-svn-id: trunk@21644 -
2012-06-18 16:59:13 +00:00
Jonas Maebe
0fc422f244 * moved definition of maxcpuregister and tcpuregisterset from cgbase to
cgutils, and define them so they are no larger than what is required by
    the current target platform
  * added cgutils to the uses clause of several units that use the
    tcpuregisterset type

git-svn-id: trunk@21624 -
2012-06-15 18:24:35 +00:00
Jonas Maebe
708a2532fc * consistently define empty saved_mm_registers arrays as containing a single
RS_INVALID superregister (instead of sometimes RS_NO and sometimes
    RS_INVALID)
  * check for RS_INVALID in tcg.g_save_registers() and ignore such entries

git-svn-id: trunk@21622 -
2012-06-15 18:24:25 +00:00
florian
64ac48c815 * patch by Nico Erfurth: Better support for PLD on ARM
git-svn-id: trunk@21572 -
2012-06-09 17:28:05 +00:00
florian
3db61ae52d * patch by Nico Erfurth: Reworked regLoadedWithNewValue
Added better support for A_STR, A_LDR, A_STM, A_LDM.

Reworked the code the use a case statement for better readability.

git-svn-id: trunk@21571 -
2012-06-09 17:27:30 +00:00
florian
03a30ff036 * patch by Nico Erfurth: Remove STRH and STRB from instructionLoadsFromReg
STRH and STRB are not handled as sperate instructions by the code
generator.

git-svn-id: trunk@21570 -
2012-06-09 17:26:06 +00:00
florian
7599de416d * patch by Nico Erfurth: Reworked MatchOperand in ARM Peephole Optimizers
Added top_ref comperator which uses RefsEqual.
Reworked the code for easier readability by using a case statement.

git-svn-id: trunk@21569 -
2012-06-09 17:25:32 +00:00
florian
6e8594a9af * patch by Nico Erfurth: Minor fix for FoldShiftProcess peephole optimizer on ARM
Use UpdateUsedRegs and drop the check for reloading of the register, as
this is done in RegUsedAfterInstruction now.

git-svn-id: trunk@21520 -
2012-06-07 18:21:46 +00:00
florian
5b02a7cb9b * patch by Nico Erfurth: Check for register reloading in RegUsedAfterInstruction on ARM
This slightly changes the semantics of RegUsedAfterInstruction.
We now check if the `current value` of the register will be used later.
It will do `the right thing` for all the normal use cases.

git-svn-id: trunk@21519 -
2012-06-07 18:20:35 +00:00
florian
45c70ec81c * patch by Nico Erfurth: Support the usage of BIC instead of AND on ARM
BIC clears the specified bits, while AND keeps them. The usage of BIC
allows a broader range of shifterconsts to be used on the ARM cpu, often
saving a cycle.

Previously code like:
Data:=Data and $FFFFFF00

would result in

mvn r1, #255
and r0, r0, r1

This patch changes this to

bic r0, r0, #255

git-svn-id: trunk@21510 -
2012-06-06 19:45:26 +00:00
florian
fefc130efc * patch by Nico Erfurth: Handle BIC properly in taicpu.spilling_get_operation_type
BIC was handled as a read only operation, which caused it to overwrite
live register content sometimes.

git-svn-id: trunk@21509 -
2012-06-06 19:44:53 +00:00
florian
8cae4c9f23 * patch by Nico Erfurth: Fix for MovStrMov Peephole optimizer on ARM
The loop checked for the wrong instruction for .opcode = A_STR. Making
the whole optimizer non functional but at least not destructive.

git-svn-id: trunk@21508 -
2012-06-06 19:44:20 +00:00
florian
83fb4c289d * patch by Nico Erfurth: Implement FoldShiftProcess Peephole optimizer for ARM
This optimizer folds shift/roll operations into following data
instructions.

It will change code like:

mov r0, r0, lsl #16
add r1, r0, r1

into

add r1, r1, r0, lsl #16

Source registers will be reordered when necessary, also SUB/SBC will be
replaced with RSB/RSC and vice versa when reordering is required.

It could be expanded to support more operations like LDR/STR.

git-svn-id: trunk@21507 -
2012-06-06 19:43:36 +00:00
florian
5393efb128 * patch by Nico Erfurth: Support A_MOV and A_MVN in RedundantMovProcess
This changes the ARM Peephole optimizer RedundantMovProcess to also
recognize and modify something like the following sequence.

mov r0, r1
mov r0, r0, lsl #8

this would be changed into

mov r0, r1, lsl #8

git-svn-id: trunk@21506 -
2012-06-06 19:43:05 +00:00
florian
4ea1d22c5a * patch by Nico Erfruth: Support BX for function returns on armv5+
BX is supported from ARMv4T onwards, but i don't have a armv4t device to
test it.

Using BX instead of mov pc,lr allows for a better pipeline utilization
by enabling the CPUs branch predictor to work properly.

git-svn-id: trunk@21505 -
2012-06-06 19:42:26 +00:00
florian
3ae5fc8c04 * patch by Nico Erfurth: adds a check for SM_ASR to also support removal of unnecessary sign extension before STRH.
git-svn-id: trunk@21446 -
2012-05-31 20:24:48 +00:00
florian
4f273aa08d * patch by Nico Erfurth: Handle STR*/LDR* properly in ARM Peephole optimizers
git-svn-id: trunk@21444 -
2012-05-31 17:00:19 +00:00
florian
fbc77b74c2 * patch by Nico Erfurth to remove superfluouse moves
git-svn-id: trunk@21422 -
2012-05-28 21:58:06 +00:00
florian
c348b6f2cc * patch by Nico Erfurth:
- Support MLA and MUL in DataMov2Data
- SMLAL and UMLAL are also reading from oper[0]
- UMLAL, UMULL, SMLAL and SMULL are writing to oper[1]

git-svn-id: trunk@21421 -
2012-05-28 18:11:31 +00:00
florian
9e180fb318 * remove unneeded zero extensions from 16 to 32 Bit
git-svn-id: trunk@21404 -
2012-05-28 07:21:27 +00:00
florian
21b94f675f + add for MLA the same register interferences as for MUL
* register interferences for MUL/MLA are only needed for less than ARMv6

git-svn-id: trunk@21385 -
2012-05-24 19:14:58 +00:00
florian
638d0d49c0 + take advantage of the mla instruction when calculating array offsets
git-svn-id: trunk@21375 -
2012-05-23 20:48:26 +00:00
florian
c75486db89 * patch by Nico Erfurth:
Reorder unaligned Load sequence on ARM

The old version produced code like that:

ldrb rDEST, [rBASE]
ldrb rTemp, [rBASE, #1]
orr  rDEST, rDEST, rTEMP lsl #8 (2 stall cycles)
ldrb rTemp, [rBASE, #2]
orr  rDEST, rDEST, rTEMP lsl #16 (2 stall cycles)
ldrb rTemp, [rBASE, #3]
orr  rDEST, rDEST, rTEMP lsl #24 (2 stall cycles)

This creates a lot of stall-cycles on ARM Implementations with load
delay slots like Marvel Kirkwood or Intel XScale. With the usual up to 2
stall-cycles this code requires a total of 13 cycles (7 instructions + 6 stall
cycles) in best case.

The new code uses a second temp register to avoid the stall cycles.

ldrb rDEST, [rBASE]
ldrb rTemp1, [rBASE, #1]
ldrb rTemp2, [rBASE, #2]
orr  rDEST, rDEST, rTEMP1 lsl #8
ldrb rTemp1, [rBASE, #3]
orr  rDEST, rDEST, rTEMP2 lsl #16
orr  rDEST, rDEST, rTEMP1 lsl #24 (1 stall cycle)

The rescheduling and second register bring the total cycles down to 8.
If a later rescheduling should happen for the last orr it even can go
down to 7.

git-svn-id: trunk@21363 -
2012-05-22 19:09:20 +00:00
florian
5f0bcd9248 * patch by Nico Erfurth:
Optimize ARM OP_MUL/OP_IMUL for x*ispowerof2(const+1) cases

Calculations like a*7 can be optimized to a*8-a with the usage of RSB and left
shifts which can be done in a single cycle.

git-svn-id: trunk@21351 -
2012-05-20 20:50:04 +00:00
florian
05a8783b1e * patch by Nico Erfurth:
Improve ARM-Peephole Optimizers

1.) Introduce a ARM-specific RegUsedAfterInstruction which analyzes
instructions and reg allocation information to see if a register is
really needed afterwards to decide if some special optimizations can be
done.

2.) Introduce "RemoveSuperfluousMove"
This tries to fold mov into a previous Data-Instruction (ADD, ORR, etc)
or LDR-Instruction.

3.) Introduce new Optimizer "DataMov2Data" and modify LdrMov2Ldr to use
RemoveSuperfluousMove

4.) Expand Ldr* and Str* Optimizers to also work on {Ldr,Str}{,b,h}

git-svn-id: trunk@21314 -
2012-05-17 08:31:44 +00:00
florian
798c9340cc * patch by Nico Erfurth:
Inline a couple of small functions of the ARM-Compiler

These small changes improved overall compile times of the fpc suite by
about 2-3% running on an 1.2GHz Kirkwood.

git-svn-id: trunk@21312 -
2012-05-17 08:03:51 +00:00
florian
b2813abec2 + patch by Bernd to add the push/pop mnemonic for arm/thumb-2, resolves #22041
git-svn-id: trunk@21310 -
2012-05-15 18:52:09 +00:00
florian
2560266e5d * skip comments properly when searching for places for constant pool distances
git-svn-id: trunk@21307 -
2012-05-15 18:08:19 +00:00
florian
748694a325 * fixes some issues with reg. allocation information
git-svn-id: trunk@21303 -
2012-05-15 18:06:41 +00:00
Jonas Maebe
edd42aa42a * moved subsetref/reg and bit_set/test support from cgobj to hlcgobj for
future use by high level code generator targets
   o this in turn required that all a_load*_loc* methods are called via
     hlcg rather than via cg, since a location can be a subsetref/reg and
     and those are no longer handled in tcg
   o that then required moving several force_location_* routines into
     thlcg because they use a_load_loc*, but did not take tdef size
     parameters (which are required by the thlcg a_load_loc* routines)
   o the only practical consequence is that from now on, you have to
     use hlcg.location_force_mem/reg() (fpureg not yet) and
     hlcg.gen_load_loc_cgpara() instead of the removed versions from ncgutil,
     and hlcg.a_load*loc*() instead of cg.a_load*loc* if a subsetref/reg
     might be involved

git-svn-id: trunk@21287 -
2012-05-13 12:33:10 +00:00
Jonas Maebe
ef2d665a50 + support for REV and several other ARMv6/ARMv6T2+ opcodes (mantis #21888)
git-svn-id: trunk@21285 -
2012-05-13 12:14:26 +00:00
Jonas Maebe
85a3fd3357 + ossinttype/osuinttype defs that correspond to OS_SINT/OS_INT for use in
the high level code generator

git-svn-id: trunk@21279 -
2012-05-12 16:03:15 +00:00
florian
77ae218556 * safer calculation of pool placement on arm
git-svn-id: trunk@21226 -
2012-05-04 19:10:30 +00:00
Jonas Maebe
834026bfb5 * synchronised with trunk up to r21067
git-svn-id: branches/jvmbackend@21068 -
2012-04-26 21:24:20 +00:00
florian
2959d596f9 * patch by Nico Erfurth: Remove superfluous mov from MovStrMov sequences
git-svn-id: trunk@21067 -
2012-04-26 20:31:13 +00:00
tom_at_work
acbc94e0fd - initial support for the android/arm target in the compiler; resulting .so's can be used for Android/ARM app development.
- basic rtl support using system calls
- fp(c)make/fppkg/makefile support

todo:
- revisit systems/t_android.pas: mostly duplicate with t_linux.pas, containing
lots of unnecessary code
- revisit rtl changes
- android ndk header translation import
- better app build/packaging support
- android/x86 support

git-svn-id: branches/targetandroid@21061 -
2012-04-26 09:36:42 +00:00
florian
aa2a9dbf2e patches by Nico Erfurth to improve the arm peephole optimizer:
* Introduce MatchInstruction and MatchOperand

MatchInstruction allows to match an instruction by condition and
oppostfix. MatchOperand checks if an operand is a register and matches
another operand. In the future this could be overloaded with other
versions not only accepting TRegister.

* Optimize cmp,moveq,movne sequence on ARM

This patch implements an peephole optimizer for the following sequence:

  cmp   reg,const1
  movne reg,const2
  moveq reg,const1

* Small improvements to the ARM peephole optimizer

Most instructions in the ARM ISA have taicpu(p).oper[0]^.typ = top_reg
as the only option, so there is no need to check for it if we're
looking at those instructions.

* Remove redundant mov instructions on ARM

This is an addition to the ARM PeepHole Optimizer.
It folds code like this:

mov reg1, reg2
add reg1, reg1, (const|reg)

git-svn-id: trunk@21024 -
2012-04-24 18:25:19 +00:00
florian
532102d3fa * use correct result registers for in64 results on armbe, resolves #21731
git-svn-id: trunk@20945 -
2012-04-20 18:07:06 +00:00
Jonas Maebe
aee5380ae0 * merged trunk up to r20882
o support for the new codepage-aware ansistrings in the jvm branch
   o empty ansistrings are now always represented by a nil pointer rather than
     by an empty string, because an empty string also has a code page which
     can confuse code (although this will make ansistrings harder to use
     in Java code)
   o more string helpers code shared between the general and jvm rtl
   o support for indexbyte/word in the jvm rtl (warning: first parameter
     is an open array rather than an untyped parameter there, so
     indexchar(pcharvar^,10,0) will be equivalent to
     indexchar[pcharvar^],10,0) there, which is different from what is
     intended; changing it to an untyped parameter wouldn't help though)
   o default() support is not yet complete
   o calling fpcres is currently broken due to limitations in
     sysutils.executeprocess() regarding handling unix quoting and
     the compiler using the same command lines for scripts and directly
     calling external programs
   o compiling the Java compiler currently requires adding ALLOW_WARNINGS=1
     to the make command line

git-svn-id: branches/jvmbackend@20887 -
2012-04-15 15:54:10 +00:00
Jonas Maebe
ac43eb9b70 + generic implementation of ReplaceForbiddenAsmSymbolChars() instead
of the AVR-specific ifdef'ed variant
   o since the only special character we use in mangled names on all platforms
     is $, added a new field to tasminfo called "dollarsign" that holds the
     character $'s should be replaced with (if it doesn't have to be replaced,
     leave it at $)

git-svn-id: trunk@20801 -
2012-04-11 18:01:57 +00:00
florian
c5445399c6 * take care also of reg. allocation information after the current instruction when moving it
git-svn-id: trunk@20709 -
2012-04-05 14:21:41 +00:00
florian
9867f34398 * the arm rescheduler has not only to move instructions but also associated register allocations
git-svn-id: trunk@20707 -
2012-04-04 21:21:52 +00:00
florian
bb8be38607 - removed some no longer used constants
git-svn-id: trunk@20688 -
2012-04-01 20:49:34 +00:00
Jonas Maebe
2a8f624eb0 * fixed returning small but "non-simple" records on ARM platforms that use
the old APCS calling convention (such as iOS): they are returned by
    reference

git-svn-id: trunk@20665 -
2012-03-29 20:54:51 +00:00
Jonas Maebe
bba4b02eb2 * use r7 instead of r11 as frame pointer on Darwin/iOS, and make sure r7
always points to the previous r7 on the stack (with the saved return
    address coming right after it) so that the debugger and crashreporter
    can use it for backtraces as specified in the ABI
   o changed NR_FRAME_POINTER_REG and RS_FRAME_POINTER_REG from a symbolic
     into a typed constant, and added a new method to tprocinfo that can
     be used to initialze it (so it can be inited to r7/r11 depending on
     the target platform)
  * allow using r9 on Darwin, it was only used by the system on iOS up to
    2.x, which we no longer support
  * prefer using r9 and r12 before r4..r11 on Darwin, because they are
    volatile and hence do not have to be saved

git-svn-id: trunk@20661 -
2012-03-29 20:54:33 +00:00
Jonas Maebe
6ba8dc7146 + support for the ARM hard float EABI on Linux (patch by Peter Green):
o new eabihf (hard float) abi
   o vfpv3_d16 variant of VFP (default variant used by EABI assemblers: VFPv3
     with only 16 double registers instead of 32) and pass it to GNU as
   o make the odd numbered single precision floating point VFP registers
     available for explicit allocation for use by the calling convention
  * fixed copy/paste error in stdname of S30 register
  -> use -dFPC_ARMHF to create an ARM eabi hard float compiler
  (mantis #21554)

git-svn-id: trunk@20660 -
2012-03-29 20:50:09 +00:00
florian
0cbdc1ae6e * deactivate assembler scheduler, needs some more fixes first
git-svn-id: trunk@20537 -
2012-03-18 17:05:22 +00:00
florian
38d3a081f6 * update of TODOs
git-svn-id: trunk@20513 -
2012-03-11 20:12:46 +00:00
florian
0fe22a358b + first version of ldr instruction scheduler on arm
git-svn-id: trunk@20512 -
2012-03-11 19:10:58 +00:00
florian
e84a43768e * typo fixed
git-svn-id: trunk@20511 -
2012-03-11 08:24:44 +00:00
florian
2f5ce095ce * RefsHaveIndexReg -> cpurefshaveindexreg
* cpurefshaveindexreg defined properly in fpcdefs.inc

git-svn-id: trunk@20504 -
2012-03-10 19:43:52 +00:00
florian
7ea7031017 + cpu type armv5t
git-svn-id: trunk@20500 -
2012-03-10 19:04:22 +00:00
florian
9c6e3d317a * reenabled ldr/ldr and ldr/str optimization
git-svn-id: trunk@20497 -
2012-03-10 17:09:42 +00:00
florian
841d67ec81 * don't waste an extra register when copying 4 bytes
git-svn-id: trunk@20475 -
2012-03-05 19:12:00 +00:00
florian
b4907578b0 * temporarily disable LDR/LDR STR/LDR optimizations, let's see if this broke regression testing on fpcarm
git-svn-id: trunk@20473 -
2012-03-04 20:37:06 +00:00
florian
fdfb9a3fba * take care of conditions when doing ldr/str optimizations
git-svn-id: trunk@20428 -
2012-02-25 21:04:28 +00:00
florian
bb2df48aa9 - <op> ....; cmp ...,#0 cmps ... optimization deactivated
* optimize ldr/ldr if possible

git-svn-id: trunk@20416 -
2012-02-23 21:29:22 +00:00
florian
e2c9a8c6a1 * fold <arithmed. op> ...; cmp ...,#0into cmps on arm
* remove unnecessary ldr after str to the same memoy location, however, to do this optimization safely, we should add support for volatile variables

git-svn-id: trunk@20399 -
2012-02-22 20:16:06 +00:00
florian
ce070c93fc + patch by Jeppe Johansen to support the SC32442B
git-svn-id: trunk@20081 -
2012-01-14 21:39:32 +00:00
florian
862f9dacea * handle int_to_bool for qwordbools correctly on arm
git-svn-id: trunk@19933 -
2011-12-31 14:14:21 +00:00
pierre
42c98f3cd5 Override abstract method to abvoid warning at compilation time
git-svn-id: trunk@19578 -
2011-11-03 10:08:12 +00:00
pierre
fc9dd61f03 Avoid warning about missing fields in embedded_controllers array
git-svn-id: trunk@19577 -
2011-11-03 10:07:35 +00:00
pierre
3a7af29d3a Use aint type local variable in read_index_shift to avoid wrong typecast
git-svn-id: trunk@19576 -
2011-11-03 10:06:36 +00:00
florian
b93f4b8096 * whitespace fixes
* implicitly add PC as base register for symbols

git-svn-id: trunk@19274 -
2011-09-28 20:36:44 +00:00
florian
ce61891ca3 * offset used by A_LDF,A_STF,A_FLDS,A_FLDD,A_FSTS,A_FSTD must be dividable by 4
git-svn-id: trunk@19270 -
2011-09-28 19:01:09 +00:00
Jonas Maebe
2b11fd2bef * cpus that only understand Thumb-2 don't support "blx <imm>"
git-svn-id: trunk@19238 -
2011-09-25 20:44:39 +00:00
florian
5fa184c952 + patch by Jeppe Johansen to make use of the div/udiv instruction on arm7m, resolves #20022
* explicitly make symbol addressing PC relative

git-svn-id: trunk@19221 -
2011-09-24 21:41:01 +00:00
Jonas Maebe
852ae48cb7 * also use blx instead of bl for direct calls on ARMv5+, since the target
may be thumb(2) (mantis #19896)
  * don't conditionalize "blx <imm target>", because that's not a valid
    encoding

git-svn-id: trunk@18984 -
2011-09-05 20:33:15 +00:00
florian
0781ac1f82 + support for lpc1768 by David Welch
git-svn-id: trunk@18927 -
2011-08-31 20:17:23 +00:00
florian
34b033ba72 + armv4t
* use armv4t and armv7m, in the makefiles instead of armv7 and cortexm3

git-svn-id: trunk@18863 -
2011-08-27 20:21:42 +00:00
florian
c95f7b1c2f * remove cpu type cortex m3 on arm, it is just an ARMv7-M
git-svn-id: trunk@18862 -
2011-08-27 19:26:16 +00:00
florian
42c94d1b91 * controllerunit.inc is no longer used
git-svn-id: trunk@18852 -
2011-08-26 07:22:09 +00:00