The optimizer now juggles around the base and index register if that opens
up the possibility of folding the shift into the instruction.
This can only be done in the case of addressmode=AM_OFFSET, in case of
[AM_POSTINDEXED, AM_PREINDEXED] we can not move the base register, as this
would cause havoc and destruction.
git-svn-id: trunk@24645 -
ShiftShiftShift2ShiftShift tried to access a wrong and already freed
instruction the find out whatever a shift will result in a 0 result.
For some reason this only resulted in a bug on x86_64 linux host
crosscompiler builds.
git-svn-id: trunk@23624 -
This one folds
mov r1, r2, lsl #2
ldr/ldrb r0, [r0, r1]
into
ldr/ldrb r0, [r0, r2, lsl #2]
There is still some room for improvement, maybe it would be better to do this before
the register allocator runs, as we'll currently waste a register (r1 in the above example)
in many cases. That would also allow to to fold more operations, because currently if r2
gets reused between the mov and ldr we'll not be able to do the optimization.
git-svn-id: trunk@23408 -
Up until now we only checked the next instruction, with the new load
scheduler this is insufficient as shift-instructions and next usage
might farther apart.
The new version uses GetNextInstructionUsingReg, this also comes with a
price as we very carefully have to check if one of the used registers is
changed and that the usage of RRX will not break when we fold and flags
get changed in between.
git-svn-id: trunk@22876 -
Add simple Mul+Sub/Mul+Add into MLS/MLA optimizations
Fix some other small issues in the optimizer
Implement Interlocked* functions with proper use of LDREX/STREX
git-svn-id: branches/laksen/arm-embedded@22801 -
fixes a couple of arm-embedded stuff,
adds some controllers, start of fpv4_s16 support, for a complete list of
changes see below:
------------------------------------------------------------------------
r22787 | laksen | 2012-10-20 22:00:36 +0200 (Sa, 20 Okt 2012) | 1 line
Properly do NR_DEFAULTFLAGS detection/allocation/deallocation
------------------------------------------------------------------------
r22782 | laksen | 2012-10-20 07:44:55 +0200 (Sa, 20 Okt 2012) | 1 line
Fixed flags detections code for wide->short optimization code for Thumb-2
------------------------------------------------------------------------
r22778 | laksen | 2012-10-19 20:23:14 +0200 (Fr, 19 Okt 2012) | 1 line
Added coprocessor registers, and support for 6 operands(MCR/MRC instructions, etc)
------------------------------------------------------------------------
r22647 | laksen | 2012-10-14 21:28:08 +0200 (So, 14 Okt 2012) | 1 line
Added register specifications to lpc1768.pp. From Joan Duran
------------------------------------------------------------------------
r22646 | laksen | 2012-10-14 21:10:20 +0200 (So, 14 Okt 2012) | 4 lines
Fixed some minor formating issues
Implemented a small heap mananger
Implemented console IO
Changed default LineEnding to CrLf(to ease console IO parsing)
------------------------------------------------------------------------
r22599 | laksen | 2012-10-09 08:58:58 +0200 (Di, 09 Okt 2012) | 1 line
Added all STM32F1 configurations
------------------------------------------------------------------------
r22597 | laksen | 2012-10-08 22:10:45 +0200 (Mo, 08 Okt 2012) | 1 line
Added initial support for the Cortex-M4F FPv4_S16 FPU
------------------------------------------------------------------------
r22596 | laksen | 2012-10-08 22:04:14 +0200 (Mo, 08 Okt 2012) | 1 line
Added FPv4_d16 FPU instructions, and a few extra registers
------------------------------------------------------------------------
r22592 | laksen | 2012-10-08 16:07:40 +0200 (Mo, 08 Okt 2012) | 2 lines
Added support for IT block merging
Added a peephole pattern check for UXTB->UXTH chains
------------------------------------------------------------------------
r22590 | laksen | 2012-10-08 14:30:00 +0200 (Mo, 08 Okt 2012) | 3 lines
Add CBNZ/CBZ instructions
Create preliminary Thumb-2 PeepHoleOptPass2 code, hacked together from the ARM mode code
Added a number of simple size optimizations for common Thumb-2 instructions
------------------------------------------------------------------------
r22582 | laksen | 2012-10-08 06:49:39 +0200 (Mo, 08 Okt 2012) | 3 lines
Fix optimizations of Thumb-2 code
Fix problem with loading of condition operand for IT instructions
Properly split IT blocks when register allocator tries to spill inside a block.
------------------------------------------------------------------------
r22581 | laksen | 2012-10-08 05:15:40 +0200 (Mo, 08 Okt 2012) | 4 lines
Fixed assembler calling command line for cpus>ARMv5TE. EDSP instructions will generate errors while assembling, due to RTL assembler routines
Updated boot code for all Cortex-M3 controllers, and sc32442b to use weak linking for exception tables.
Cortex-M3 devices now also share initialization routine to simplify maintenance
STM32F10x classes now have specific units which fit the interrupt source names and counts
------------------------------------------------------------------------
r22580 | laksen | 2012-10-08 05:10:44 +0200 (Mo, 08 Okt 2012) | 2 lines
Added support for .section, .set, .weak, and .thumb_set directive for GAS assembler reader
IFDEF'ed JVM specific assembler directives, to prevent ait_* set to exceed 32 elements
------------------------------------------------------------------------
r22579 | laksen | 2012-10-08 02:10:52 +0200 (Mo, 08 Okt 2012) | 3 lines
Remove all traces of the interrupt vector table generation mechanism
Clean up cpuinfo tables
Fixed ARMv7M bug(BLX <label> doesn't exist on that version)
git-svn-id: trunk@22792 -
The function regLoadedWithNewValue returned true if the oper[0].reg
matched in an STR instruction, which is wrong as it will only be read.
git-svn-id: trunk@22623 -
Up until now DataMov2Data could be run on an strb generated by
AndStrb2Strb.
Code like this:
and reg0, reg1, #255
strb reg0, [r13]
mov reg2,reg1
would get transformed into:
strb reg2, [r13]
which is clearly wrong. The problem was that DataMov2Data expected that
it's first parameter is an instruction which loads new data into
oper[0]. With the introduction of AndStrb2Strb this wasn't true anymore.
This fix now checks if the first register is actually written to, this
is done by using regLoadedWithNewValue.
git-svn-id: trunk@22622 -
Create preliminary Thumb-2 PeepHoleOptPass2 code, hacked together from the ARM mode code
Added a number of simple size optimizations for common Thumb-2 instructions
git-svn-id: branches/laksen/arm-embedded@22590 -
Fix problem with loading of condition operand for IT instructions
Properly split IT blocks when register allocator tries to spill inside a block.
git-svn-id: branches/laksen/arm-embedded@22582 -
We don't need to check for the postfix, PF_NONE/PF_H/PF_B are all ok for us and
can be intermixed. This allows the peephole optimizer to work for
strb and strh instructions.
git-svn-id: trunk@22367 -
It is the same as the normal MatchInstruction function but supports to
be called with a set of TAsmOps instead of a single op.
git-svn-id: trunk@22231 -
1.) For UMULL and UMLAL support we would have to make sure the following
code checks RdHi and RdLo, which is currently not supported.
The former code would transform the following
umull r0, r1, r2, r3
cmp r0, #0
bne .LSomething
into
umulls r0,r1,r2,r3
bne .LSomething
which is wrong. UMULL has a 64bit result in r1+r0 and checks the full 64bit for 0
before setting the Z flag.
2.) Support MLA.
3.) Support MI/PL/NE/EQ for all instructions. As all of them are setting
the N and Z flags in the same way only based on the result of the
operation not on its input values.
N:=Result[31];
Z:=Result = 0;
Wurst
git-svn-id: trunk@22213 -
r21885 added a new peephole optimizer. The associated code refactoring
missed a check for
tai(hp1).typ = tai_instruction
Which can lead to an access violation later on, because the rest of the
code expects to find a taicpu in hp1.
git-svn-id: trunk@21949 -
This slightly changes the semantics of RegUsedAfterInstruction.
We now check if the `current value` of the register will be used later.
It will do `the right thing` for all the normal use cases.
git-svn-id: trunk@21519 -
The loop checked for the wrong instruction for .opcode = A_STR. Making
the whole optimizer non functional but at least not destructive.
git-svn-id: trunk@21508 -
This optimizer folds shift/roll operations into following data
instructions.
It will change code like:
mov r0, r0, lsl #16
add r1, r0, r1
into
add r1, r1, r0, lsl #16
Source registers will be reordered when necessary, also SUB/SBC will be
replaced with RSB/RSC and vice versa when reordering is required.
It could be expanded to support more operations like LDR/STR.
git-svn-id: trunk@21507 -
This changes the ARM Peephole optimizer RedundantMovProcess to also
recognize and modify something like the following sequence.
mov r0, r1
mov r0, r0, lsl #8
this would be changed into
mov r0, r1, lsl #8
git-svn-id: trunk@21506 -