- xorq %reg,%reg (identical registers) is now changed to xorl %reg,%reg if doing so removes the REX prefix.
- movw %bx,%ax; andl $0xffff,%eax, for example, is now changed to movzwl %bx,%eax as long as a conditional operation doesn't follow 'and' (checks to see if the CPU flags are in use).
- movzbq and movzwq get optimised to movzbl and movzwl respectively if doing so removes the REX prefix.
- Removal of optimisation code that zero-extends from 32-bit to 64-bit, because there isn't actually a valid combination of opcodes for MOVZX that allows that (for registers,
just use MOV). This is not the case with MOVSX.
- movq is now optimised to movl even if the CPU flags are in use (this stops mov %reg,0 from being optimised to xor %reg,%reg if doing so breaks an algorithm that relies on them).
- Fixed typo in peephole message regarding movq to movl (it said movd instead).
- Made the peephole debug messages more consistent in formatting, some of which now have more detail.
* small fixes of the patch
git-svn-id: trunk@38070 -
- Moved the part that emits the CMOV command outside of the if-else block, because it's the same in both branches and was just duplicated code.
- Moved a comment about powers of 2 to be right before the correct if-else block.
- Added a couple of comments to explain what the algorithm is doing to obtain the remainder.
- Added missing "writeln('ok');" (since 'tmoddiv3.pp' has it) and program header to 'tmoddiv4.pp'.
- Changed program name from "testfile2" to "tmoddiv3" in 'tmoddiv3.pp'.
git-svn-id: trunk@37939 -
from cpubase unit to a method in the tcg class. The reason for doing that is
that this is now a standard part of the 16-bit and 8-bit code generators and
moving to the tcg class allows doing extra checks (not done yet, but for
example, in the future, we can keep track of whether there was an extra
register allocated with getintregister and halt with an internalerror in case
GetNextReg() is called for registers, which weren't allocated as a part of a
sequence, therefore catching a certain class of 8-bit and 16-bit code
generator bugs at compile time, instead of generating wrong code).
- removed GetLastReg() from avr's cpubase unit, because it isn't used for
anything. It might be added to the tcg class, in case it's ever needed, but
for now I've left it out.
* GetOffsetReg() and GetOffsetReg64() were also moved to the tcg unit.
git-svn-id: trunk@37180 -
(i386 and x86_64) code generator (same as the division by a positive power of
2, followed by a NEG instruction, to invert the sign of the result; previously
the code generator generated an IMUL instruction with a magic constant,
followed by shift; the new code sequence should be both shorter and faster)
git-svn-id: trunk@37003 -
with calls to cg.a_op_const_reg in the x86 div code generator, so that the
same code can be used in the future for i8086 as well (SHR and SAR by
constants other than 1 are 186+, so on 8086 they have to go through the CL
register, which is handled correctly in cg.a_op_const_reg)
git-svn-id: trunk@36815 -
o separate information for reading and writing, because e.g. in a
try-block, only the writes to local variables and parameters are
volatile (they have to be committed immediately in case the next
instruction causes an exception)
o for now, only references to absolute memory addresses are marked
as volatile
o the volatily information is (should be) properly maintained throughout
all code generators for all archictures with this patch
o no optimizers or other compiler infrastructure uses the volatility
information yet
o this functionality is not (yet) exposed at the language level, it
is only for internal code generator use right now
git-svn-id: trunk@34996 -
resulttype of the div node set by the type checking pass, this is
also how the generic code generator handles it, resolves#27173
git-svn-id: trunk@29382 -
future use by high level code generator targets
o this in turn required that all a_load*_loc* methods are called via
hlcg rather than via cg, since a location can be a subsetref/reg and
and those are no longer handled in tcg
o that then required moving several force_location_* routines into
thlcg because they use a_load_loc*, but did not take tdef size
parameters (which are required by the thlcg a_load_loc* routines)
o the only practical consequence is that from now on, you have to
use hlcg.location_force_mem/reg() (fpureg not yet) and
hlcg.gen_load_loc_cgpara() instead of the removed versions from ncgutil,
and hlcg.a_load*loc*() instead of cg.a_load*loc* if a subsetref/reg
might be involved
git-svn-id: trunk@21287 -
+ RTL support:
o VFP exceptions are disabled by default on Darwin,
because they cause kernel panics on iPhoneOS 2.2.1 at least
o all denormals are truncated to 0 on Darwin, because disabling
that also causes kernel panics on iPhoneOS 2.2.1 (probably
because otherwise denormals can also cause exceptions)
* set softfloat rounding mode correctly for non-wince/darwin/vfp
targets
+ compiler support: only half the number of single precision
registers is available due to limitations of the register
allocator
+ added a number of comments about why the stackframe on ARM is
set up the way it is by the compiler
+ added regtype and subregtype info to regsets, because they're
also used for VFP registers (+ support in assembler reader)
+ various generic support routines for dealing with floating point
values located in integer registers that have to be transferred to
mm registers (needed for VFP)
* renamed use_sse() to use_vectorfpu() and also use it for
ARM/vfp support
o only superficially tested for Linux (compiler compiled with -Cpvfpv6
-Cfvfpv2 works on a Cortex-A8, no testsuite run performed -- at least
the fpu exception handler still needs to be implemented), Darwin has
been tested more thoroughly
+ added ARMv6 cpu type and made it default for Darwin/ARM
+ ARMv6+ implementations of atomic operations using ldrex/strex
* don't use r9 on Darwin/ARM, as it's reserved under certain
circumstances (don't know yet which ones)
* changed C-test object files for ARM/Darwin to ARMv6 versions
* check in assembler reader that regsets are not empty, because
instructions with a regset operand have undefined behaviour in that
case
* fixed resultdef of tarmtypeconvnode.first_int_to_real in case of
int64->single type conversion
* fixed constant pool locations in case 64 bit constants are generated,
and/or when vfp instructions with limited reach are present
WARNING: when using VFP on an ARMv6 or later cpu, you *must* compile all
code with -Cparmv6 (or higher), or you will get crashes. The reason is
that storing/restoring multiple VFP registers must happen using
different instructions on pre/post-ARMv6.
git-svn-id: trunk@14317 -
alignment for each memory reference (mantis #12137, and
test/packages/fcl-registry/tregistry1.pp on sparc). This also
enables better code generation for packed records in many cases.
o several changes were made to the compiler to minimise the chances
of accidentally forgetting to set the alignment of memory references
in the future:
- reference_reset*() now has an extra alignment parameter
- location_reset() can now only be used for non LOC_(C)REFERENCE,
use location_reset_ref() for those (split the tloc enum so the
compiler can catch errors using range checking)
git-svn-id: trunk@12719 -
* small cleanups of unused variables in firstpass
* node_resources_fpu() created to get an approximation of the
required fpu registers
* for the moment use node_complexity in the CG until the
node_resource_int() is created
git-svn-id: trunk@8655 -
* fixed downsizing the precision of floating point values
* floating point constants are now treated using only the minimal
precision required (e.g. 2.0 is now a single, 1.1 extended etc)
(Delphi compatible)
git-svn-id: trunk@5927 -
* symtables based on TFPHashObjectList and TFPObjectList
* rename torddef.typ to torddef.ordtype
* rename tfloatdef.typ to tfloatdef.floattype
* rename tdef.deftype to tdef.typ
* remove obsolete browser code, browcol is kept so the ide
can still be compiled
git-svn-id: trunk@5192 -