Commit Graph

541 Commits

Author SHA1 Message Date
florian
55e6da6d28 * make cpubase for arm use inlining
git-svn-id: trunk@22183 -
2012-08-22 19:51:08 +00:00
florian
d8161c185c + track usage of flags by using a new register RS_/NR_DEFAULTFLAGS
git-svn-id: trunk@22179 -
2012-08-22 19:37:51 +00:00
florian
2a14394cf5 * cleaned up scheduler code, created own scheduler class to avoid unneeded passes through the assembler
git-svn-id: trunk@22133 -
2012-08-19 19:15:34 +00:00
florian
a3bf956c33 * improved main loop of TCpuPreRegallocScheduler.PeepHoleOptPass1Cpu
* reordered conditions in scheduler main loop so they abort potentially quicker

git-svn-id: trunk@22132 -
2012-08-19 19:13:49 +00:00
florian
765fb18679 + add a description to the cpuflags where I know the exact meaning/definition
git-svn-id: trunk@22119 -
2012-08-17 20:45:46 +00:00
florian
54e2b40ab4 * revert the parameter type change of the last commit, it was an overleft from a failed fix attempt
git-svn-id: trunk@22116 -
2012-08-17 19:36:37 +00:00
florian
ba6ba52e7f * instruction scheduling is pretty slow so make it a level 3 optimization for now
git-svn-id: trunk@22115 -
2012-08-17 19:36:29 +00:00
florian
45eafd3e65 * fix MovMov optimization if the second mov is a mov rX,rX
git-svn-id: trunk@22114 -
2012-08-17 19:36:22 +00:00
florian
4b4e08c28b * fixes copy&paste errors when moving end of live pointers
git-svn-id: trunk@22113 -
2012-08-17 19:36:16 +00:00
florian
53a0d3e3a3 * fixed typo when checking live start of references
git-svn-id: trunk@22112 -
2012-08-17 19:36:10 +00:00
florian
5ceeb8aaa9 * enable scheduler when compiling at least with -O2
git-svn-id: trunk@22111 -
2012-08-17 19:36:04 +00:00
florian
a693fe9fb7 + implemented TCpuPreRegallocScheduler.SwapRegLive and make use of it to be able to reschedule instructions before register allocation
git-svn-id: trunk@22110 -
2012-08-17 19:35:59 +00:00
florian
7e5b8584cf * set MaxOps to 4 for the optimizer because fpc generates now mla instructions
git-svn-id: trunk@22106 -
2012-08-17 12:38:59 +00:00
florian
354cac2bb6 + completed arm architectures
* ldrd/strd and pld collected under the edsp define

git-svn-id: trunk@22104 -
2012-08-17 10:37:27 +00:00
florian
7588896775 * make use of cpuflags in the arm compiler
* armv5te architecture

git-svn-id: trunk@22103 -
2012-08-17 10:37:17 +00:00
florian
e4f89fe524 + introduce cpuflags for arm
git-svn-id: trunk@22090 -
2012-08-15 15:49:05 +00:00
florian
ecb037ad79 + tarminnode.pass_1 to set expectloc correctly
git-svn-id: trunk@22074 -
2012-08-13 15:03:35 +00:00
florian
d2aa35e9de * throw an internal error if code generation depends on expectloc but expectloc and real loc do not match
git-svn-id: trunk@22073 -
2012-08-13 15:02:55 +00:00
florian
33f287d320 + tarminnode.in_smallset making use of tst
git-svn-id: trunk@22064 -
2012-08-11 22:10:45 +00:00
florian
19debd87cc * start with a qword aligned frame pointer to enable more ldrd/strd optimizations
git-svn-id: trunk@22061 -
2012-08-11 15:12:19 +00:00
florian
371ef7bada * cover more cases in AlignedToQWord
git-svn-id: trunk@22060 -
2012-08-11 15:11:43 +00:00
florian
db7e029574 * strd/ldrd optimization might be only done on dword operations
git-svn-id: trunk@22059 -
2012-08-11 15:11:10 +00:00
florian
8c45a909be + support ldr/ldr -> ldrd and str/str -> strd optimization where appliable
git-svn-id: trunk@22058 -
2012-08-11 11:45:54 +00:00
florian
4d86d25c6c * -O4 switch for optimizations which are correct but which might have unexpected effects
like field reordering (possible problems cracker classes) or using ebp as normal register (broken
      stack traces from dump_stack)
    + niln is also valid in a cse domain
    * parameters passed by reference shall have a complexity >1
    * load nodes from outer scopes shall have a complexity >1
    * better cse debugging
    + more node types added to cse
    * consider parameters passed by reference in cse
    * take care of cse in parameters in simple cases

git-svn-id: trunk@22050 -
2012-08-09 18:58:54 +00:00
florian
b330bba0bc + introduce -Oofastmath
* limit the application of the tree transformation introduced in r21986 to safe cases and -Oofastmath

git-svn-id: trunk@22040 -
2012-08-08 19:35:45 +00:00
masta
aa21845cd9 Small optimization for OP_AND on ARM
Especially with 64bit operators the CG sometimes generates:
and r0, r1, #0
Which just clears r0 and is equivalent with
mov r0, #0

git-svn-id: trunk@22032 -
2012-08-08 06:44:20 +00:00
florian
7513291ad8 * generate different code for OS_S8 -> OS_16 conversion which might fold better, idea by Nico Erfurth
git-svn-id: trunk@22027 -
2012-08-07 19:36:46 +00:00
masta
6529307d9e Don't emit useless AND/BICs in ARM CG
In certain cases the CG would emit something like
bic r1, r0, #0
As BIC is clearing the specified bits this is equivalent to
mov r1, r0
This patch changes the CG to emit the mov instead which the register
allocator will hopefully remove most of the time.

git-svn-id: trunk@22024 -
2012-08-07 06:46:45 +00:00
masta
9e039936bf Support more operators in FoldShiftProcess on ARM
Now we can also fold shifts into teq, tst, cmp, cmn instructions.

git-svn-id: trunk@22023 -
2012-08-07 06:46:32 +00:00
florian
f619a1aaf6 * fld/fst can have a base register+offset
git-svn-id: trunk@22016 -
2012-08-05 18:34:13 +00:00
florian
e81ba0f82e + make use of the armv6+ sign/zero extension instructions if appropriate
git-svn-id: trunk@22013 -
2012-08-05 14:04:11 +00:00
florian
eb1efdff8a + introduce cstylearrayofconst because pocall_mwcall was forgotten at several places
git-svn-id: trunk@22012 -
2012-08-05 08:48:23 +00:00
florian
19ed835f2b * don't generate an extra indirection when loading vfp constants
git-svn-id: trunk@22010 -
2012-08-04 17:01:57 +00:00
masta
8a684c1f10 Don't generate IT instruction in second_cmp64bit for Thumb-2
Currently the register spiller can not handle the "bond" between IT* and
a following instruction, sometimes breaking them apart, which breaks the
build or worse the result.

So for now we're not emitting A_IT* in second_cmp64bit anymore but use a
conditional jump instead.

This fixes Mantis #22520

git-svn-id: trunk@22009 -
2012-08-04 16:55:58 +00:00
masta
1c51b8d906 Disable 64bit shifts for thumb2 - Fix for Mantis #22520
In r21686 I've introduced optimized 64bit shifts for ARM. But the
methods did not check for which machine it has to generate the code.

This patch disables the optimized code for now if the target is in
cpu_thumb2 and falls back to the generic code.

There are 2 problems with the current code:

1.) Thumb-2 does not support shift by register on all data instruction
as ARM does.
2.) The code does not generate the required IT-block for the conditional
executed code.

git-svn-id: trunk@21997 -
2012-08-02 00:56:21 +00:00
masta
c16871e129 Generate better code in Tthumb2cgarm.g_flags2reg
The old code generated a strange IT-sequence:

IT EQ
MOVEQ r0, #1
IT NE
MOVNE r0, #1

Now we generate:

ITE EQ
MOVEQ r0, #1
MOVNE r0, #1

IT stands for IfThen, ITE for IfThenElse it has a couple of other forms
where the instruction gets extended to handle more of the following
instructions. So we have ITEE, ITETE etc, up to 4 instructions can be
handled.

git-svn-id: trunk@21996 -
2012-08-02 00:56:15 +00:00
florian
023d632f44 * optimize also lsr/asr, lsl, lsr/asr sequences on arm
git-svn-id: trunk@21981 -
2012-07-28 22:30:11 +00:00
florian
283afbcb07 * new controllers by lelekx, resolves #22523
git-svn-id: trunk@21980 -
2012-07-28 21:57:29 +00:00
florian
c5ad1bce7b * avoid uncessary zero extensions in case code
git-svn-id: trunk@21979 -
2012-07-28 20:09:21 +00:00
florian
c8435b503f * better folding of consecutive shift operations
git-svn-id: trunk@21978 -
2012-07-28 17:59:45 +00:00
florian
614afc1c8f * pass march to GNU AS for cpu_armv6 and cpu_armv7
git-svn-id: trunk@21958 -
2012-07-23 20:20:17 +00:00
Jonas Maebe
0a1157da38 * fixed memory leaks in the compiler introduced in r21862 by marking and
releasing temporarily created function result locations

git-svn-id: trunk@21953 -
2012-07-23 13:49:29 +00:00
florian
d5aa89449e * generate less register wasting code for 64 bit comparions
git-svn-id: trunk@21950 -
2012-07-22 21:07:33 +00:00
masta
be6bf6e3f7 Fix possible access violation introduces in r21885
r21885 added a new peephole optimizer. The associated code refactoring
missed a check for

  tai(hp1).typ = tai_instruction

Which can lead to an access violation later on, because the rest of the
code expects to find a taicpu in hp1.

git-svn-id: trunk@21949 -
2012-07-22 18:06:08 +00:00
Jonas Maebe
3798b79fd7 + optimization that (re)orders instance fields of Delphi-style classes in
order to minimise memory losses due to alignment padding. Not yet enabled
    by default at any optimization level, but can be (de)activated separately
    via -Oo(no)orderfields
   o added separate tdef.structalignment method that returns the alignment
     of a type when it appears in a record/object/class (factors out
     AIX-specific double alignment in structs)
   o changed the handling of the offset of a delegate interface
     implemented via a field, by taking the field offset on demand
     rather than at declaration time (because the ordering optimization
     causes the offsets of fields to be unknown until the entire
     declaration has been parsed)

git-svn-id: trunk@21947 -
2012-07-22 16:47:19 +00:00
masta
aa4fe66153 Fix ARM ASM-reader for MVN/CMP/CMN/TST/TEQ
Like MOV these instructions support 2 operands, with the second beeing a
shifterop.

Without this patch the asm reader would fail on something like

cmp r0, r1, lsr 16

with

Error: Unknown identifier "LSR"

git-svn-id: trunk@21911 -
2012-07-15 01:03:08 +00:00
masta
e2a744e19b Consolidate do_spill_read/do_spill_written on arm
ARM can not reference an arbitrary offset so it needs some special
handling if the offset goes beyond abs(4095).

The code for do_spill_read and do_spill written used to be very similar.
I've partially factored out the code into spilling_create_load_store.

The former code loaded the offset from a constant pool, which is a waste
of memory-bandwidth and cache lines. The new code tries to find a way to
adjust the baseregister so the memory location can be reached more
easily, this allows us to handle at least +-1MB with just a single
additional ADD or SUB instruction. If that fails we'll resort to the normal
constant loading code, which on it's own will fallback to loading the
constant from a constant-pool.

So instead of:
ldr r1, =16388
ldr r0, [r13, r1]

which will at least uses 4 cycles (2 Instruction cycles + 2 stall
cycles) on most cores.

We try to generate:
add r1, r13, #16384
ldr r0, [r1, #4]

which most armv5+ cores will execute in 2 cycles. We'll also save on
DCache usage.

git-svn-id: trunk@21889 -
2012-07-12 01:11:23 +00:00
florian
701a5d76bb * remove unneeded movs
git-svn-id: trunk@21885 -
2012-07-11 20:58:52 +00:00
masta
57b67dfa30 Better SP adjustments on entry/exit for ARM
If the needed adjustment is not expressible in a shifterconst, the old code
loaded a temporary register (fixed to r12) via a_load_const_reg and used it
to adjust the SP. Resulting in:

mov r12, #44
orr r12, r12, #4096
sub sp, sp, r12

The new code will try to split the adjustment into 2 shifterconstants and
will do two seperate adjustments:

sub sp, sp, #44
sub sp, sp, #4096

If that doesn't work we'll fall back to the old code. But that should
happen VERY rarely, only for stacks bigger than 256k which are not
expressible in 2 shifter constants.

git-svn-id: trunk@21863 -
2012-07-11 08:41:45 +00:00
florian
95732625cc * use r11 as a normal register if no frame pointer is needed
git-svn-id: trunk@21834 -
2012-07-09 17:17:23 +00:00