Use UXTH+UXTB instructions instead of two shifts on processors that support them (see the example below).
Eliminate an internalerror when constant pointers are typecast as arrays.
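For illustration, the kind of shift pair this targets (register choice hypothetical):
mov r0, r0, lsl #24
mov r0, r0, lsr #24
zero-extends the low byte and can be replaced by
uxtb r0, r0
The halfword case works the same way with lsl/lsr #16 and UXTH.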
git-svn-id: trunk@26647 -
The load scheduler does not handle LDRD correctly right now, but it does
not prevent A_LDR with PF_D set from being scheduled.
git-svn-id: trunk@26637 -
Some time ago we introduced GetNextInstructionUsingReg, which might
return an instruction a couple of instructions away from our current
location. Most of the code then just returned the new instruction (hp1)
instead of the instruction following p. This could prevent the peephole
optimizer from finding possible optimizations.
git-svn-id: trunk@26605 -
The existing LdrLdr2LdrMov optimizer will generate a lot of
sequences like this:
ldr regA, [...]
mov regB, regA
ldr regB, [regB, ...]
This now gets changed to:
ldr regA, [...]
ldr regB, [regA, ...]
This saves an instruction and might open up more possibilities for the load scheduler.
git-svn-id: trunk@26603 -
LDRD and STRD only have the first, even-numbered register in their instruction operands;
this additional code will also check for the register following it.
Example:
ldrd r0, [r13]
The old code would only detect r0 as in use, not the implicitly used r1.
git-svn-id: trunk@26602 -
Fixed a bug where ARMv7M targets would not use the DIV instructions.
Moved many size-optimizing Thumb2 peephole optimizations to PostPeepHoleOptsCpu. Previously those optimizations could make it impossible to reuse the shared ARM peephole optimizations.
Reenabled a fixed MLA/MLS peephole optimization.
Refactored some FindRegDealloc+regLoadedWithNewValue into RegEndOfLife calls.
Fixed some broken UXTB/UXTH optimizations. Previously they would also match UXT* instructions with ROR shifter ops.
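To illustrate the last point (registers chosen purely for the example):
uxtb r0, r1
zero-extends bits 0..7 of r1, while
uxtb r0, r1, ror #8
first rotates r1 by 8 bits and therefore extends bits 8..15.
An optimization that treats both forms as plain extensions produces wrong code.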
git-svn-id: trunk@26198 -
The previous code deleted the newly inserted instruction instead of the
existing one, which obviously broke code.
Assembly:
mov r0, r0, lsr #23
mov r0, r0, lsr #23
transformed into:
mov r0, r0, lsr #23
expected was:
mov r0, #0
The problem only shows up in the very unlikely case of two LSR/ASR or
two LSL instructions following each other with a total shift of more than 31
bits.
This fixes test/opt/tarmshift.pp
I've also removed the {%norun} directive from tarmshift.pp as this test
only makes sense when it is actually run.
git-svn-id: trunk@25374 -
The optimizer now juggles around the base and index register if that opens
up the possibility of folding the shift into the instruction.
This can only be done in the case of addressmode=AM_OFFSET; in case of
[AM_POSTINDEXED, AM_PREINDEXED] we cannot move the base register, as this
would cause havoc and destruction.
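A hypothetical before/after sequence (register names made up for illustration):
mov r1, r2, lsl #2
ldr r0, [r1, r3]
Only the index register of an offset address can carry the shift, so base and index are swapped when folding:
ldr r0, [r3, r2, lsl #2]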
git-svn-id: trunk@24645 -
ShiftShiftShift2ShiftShift tried to access a wrong and already freed
instruction to find out whether a shift will result in 0.
For some reason this only resulted in a bug in cross-compiler builds on
x86_64 Linux hosts.
git-svn-id: trunk@23624 -
This one folds
mov r1, r2, lsl #2
ldr/ldrb r0, [r0, r1]
into
ldr/ldrb r0, [r0, r2, lsl #2]
There is still some room for improvement; maybe it would be better to do this before
the register allocator runs, as we currently waste a register (r1 in the above example)
in many cases. That would also allow us to fold more operations, because currently if r2
gets reused between the mov and the ldr we are not able to do the optimization.
git-svn-id: trunk@23408 -