This patch improves the compiler where "case" statements are concerned, using jump tables more often and creating more efficient machine code in some situations:
* If a case block only contains one branch (not including the else block), the initial range check is removed, since this becomes wasted effort.
* If the else block is empty, the else label is set to the end label - though this doesn't decrease the code size, it takes a bit of strain off the peephole optimizer.
* On -O2 and above, some node analysis is now done on the branch labels. Most of the time this just redirects the label of
an empty block to the end label, but if the block contains a goto statement, the label is redirected to the goto's destination instead,
thus increasing performance by avoiding chained jumps (the peephole optimiser won't pick this up if the label addresses are in a jump table).
* Some checks now use what I call the 'true count' rather than the 'label count'. The true count includes each
individual value in a range - for example, 0..2 counts as 3. This increases the chance that a jump table will be
utilised in situations where it is more efficient than a linear list.
* For jump tables, if the case block almost covers the entire range (32 entries or fewer from full coverage),
the initial range check is removed and the gaps are included in the jump table (pointing to the else label).
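
The 'true count' and gap-filling ideas above can be illustrated with a small Python model. This is only a sketch, not the compiler's code: the function names, the (low, high) tuple representation of case labels, and the reading of "full coverage" as the whole range of the selector type are assumptions made for illustration.

```python
def label_count(labels):
    """Number of case labels; a range such as 0..2 counts once."""
    return len(labels)


def true_count(labels):
    """Number of individual values covered; 0..2 counts as 3."""
    return sum(high - low + 1 for low, high in labels)


def can_omit_range_check(labels, type_low, type_high, max_missing=32):
    """True if the labels are within max_missing entries of full coverage.

    Assumption: 'full coverage' means every value of the selector type.
    """
    full = type_high - type_low + 1
    return full - true_count(labels) <= max_missing


def build_jump_table(branches, else_label, low, high):
    """Build a table for low..high; gaps point to the else label.

    branches is a list of ((low, high), target) pairs (hypothetical format).
    """
    table = [else_label] * (high - low + 1)
    for (lo, hi), target in branches:
        for value in range(lo, hi + 1):
            table[value - low] = target
    return table
```

For example, the labels 0..2 and 5 have a label count of 2 but a true count of 4, and the table built for 0..5 sends the values 3 and 4 to the else label.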
git-svn-id: trunk@40676 -
* disable matching volatile references in the assembler optimisers, so they
can't be removed (more conservative than needed, but better than removing
too many)
o the CSE optimiser will ignore them by default, because they're an unknown
inline node for it
* also removed no longer used fpc_in_move_x and fpc_in_fillchar_x inline node
identifiers from rtl/inc/innr.inc, and placed fpc_in_unaligned_x at the
right place
git-svn-id: trunk@40465 -
in the i8086-msdos 'ports' unit, but will be enabled on other targets (e.g.
go32v2) in the future as well. 32-bit 'in' and 'out' are not inlined on i8086, but
will be on i386 and x86_64.
git-svn-id: trunk@39362 -
registers to -1 in x86reg.dat. The values that used to be there weren't used
at all (most were just copies of the 32-bit version of the register). This can
be easily demonstrated by the fact that running 'make regdat' in the compiler
directory doesn't change any of the generated files for i8086/i386/x86_64.
git-svn-id: trunk@39098 -
and filled it with the dwarf register mapping, used by Open Watcom (Watcom
also uses this mapping on i386, but we don't need to support their debugger on
i386 for now)
git-svn-id: trunk@39097 -
calling conventions on i386
* generated code at the caller side for pocall_pascal routines on i386 no longer
assumes the routine destroys all registers (except ebp) - instead now it
assumes that it preserves the ebx,esi,edi and ebp registers. This is
compatible with the pascal calling convention of 32-bit delphi and was already
honoured by FPC on the callee side.
* updated the list of calling conventions that save all registers, used in
tx86callnode.can_call_ref, so it is accurate on all x86 platforms - i8086,
i386 and x86_64.
git-svn-id: trunk@38904 -
so that they can still be freed after the reference has been changed
(e.g. in case of array indexing or record field accesses) (mantis #33628)
git-svn-id: trunk@38814 -
tcpuparamanager, very similar to the existing get_volatile_registers_XXX. The
new methods are called get_saved_registers_XXX, where XXX is the register
type ("int", "address", "fpu" or "mm")
git-svn-id: trunk@38794 -
* disable generation of RVA and SECREL32 symbols (according to comment in taiconst_type, they are win32 only)
* use lowercase cpu names (it was changed from case-insensitive names sometime after 2.10)
git-svn-id: trunk@38579 -
depending on the combination of operand types; this is done so that adding
OPR_LOCAL with OPR_REFERENCE operands can be supported later.
git-svn-id: trunk@38443 -
syntax in the x86 intel syntax asm reader; this is preparation for support of
segment overrides inside the reference expression (i.e. [es:bx] instead of
es:[bx])
git-svn-id: trunk@38363 -
[expr1][expr2] = [expr1+expr2]
[expr1[expr2]] = [expr1+expr2]
This is compatible with TP7's inline asm, and perhaps also with tasm/masm/delphi.
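
The two equivalences above can be modelled by a small text-level normaliser. This is a sketch in Python for illustration only; normalize_reference is a hypothetical helper and does not reflect how the asm reader actually parses references.

```python
import re


def normalize_reference(expr):
    """Rewrite [a][b] and [a[b]] as [a+b] by joining the bracketed parts."""
    parts = [p.strip() for p in re.split(r"[\[\]]", expr) if p.strip()]
    return "[" + "+".join(parts) + "]"
```

Both "[bx][si]" and "[bx[si]]" normalise to "[bx+si]" under this model.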
git-svn-id: trunk@38352 -
record field). This makes e.g.
test [di + recordtype], 1
work and use the size of recordtype to determine the operand size; recordtype
itself is evaluated to 0, so if recordtype's size is 2 bytes, the above
instruction assembles as:
test word ptr [di], 1
Ugly, but TP7 compatible.
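
The rule described above (the type name contributes 0 to the address, and its size selects the operand size) can be sketched as follows. This is an illustrative Python model; the helper name and the size table are assumptions, not code from the asm reader.

```python
# Operand-size keywords by size in bytes.
PTR_KEYWORDS = {1: "byte ptr", 2: "word ptr", 4: "dword ptr", 8: "qword ptr"}


def resolve_type_operand(base_reg, type_size):
    """Model of '[base_reg + sometype]' where sometype is type_size bytes."""
    displacement = 0  # the type name itself evaluates to 0
    keyword = PTR_KEYWORDS[type_size]
    if displacement == 0:
        ref = "[%s]" % base_reg
    else:
        ref = "[%s+%d]" % (base_reg, displacement)
    return "%s %s" % (keyword, ref)
```

With a 2-byte recordtype, resolve_type_operand("di", 2) gives "word ptr [di]", matching the example above.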
git-svn-id: trunk@38176 -
which allows const symbol expressions to also have a size sometimes. Why?
Because TP7 (and perhaps Delphi) allows not specifying the size in e.g.
test [di+recordtype.recordfield], 1
in this case, the operand size (byte ptr, word ptr, dword ptr, qword ptr) is
determined by the size of recordtype.recordfield; this already happens with
variables, but in this case, this is a type.field, which is resolved to a
constant.
This commit only adds a dummy 'size' parameter, which is always initialized to
0 and not used. The actual implementation of the above will follow in separate
commits.
git-svn-id: trunk@38173 -
tx86intreader.BuildConstSymbolExpression; it returns whether the 'OFFSET'
keyword has been used in the expression. This will be used for disambiguation
between 'dd xx' and 'dd offset xx', because they should produce different
results on i8086 (the first generates a far pointer, i.e. the same as
'dw xx, SEG xx', the second - a 32-bit offset)
git-svn-id: trunk@38147 -
- xorq %reg,%reg (identical registers) is now changed to xorl %reg,%reg if doing so removes the REX prefix.
- movw %bx,%ax; andl $0xffff,%eax, for example, is now changed to movzwl %bx,%eax as long as a conditional operation doesn't follow 'and' (checks to see if the CPU flags are in use).
- movzbq and movzwq get optimised to movzbl and movzwl respectively if doing so removes the REX prefix.
- Removal of optimisation code that zero-extends from 32-bit to 64-bit, because there isn't actually a valid combination of opcodes for MOVZX that allows that (for registers,
just use MOV). This is not the case with MOVSX.
- movq is now optimised to movl even if the CPU flags are in use (this stops mov $0,%reg from being optimised to xor %reg,%reg if doing so breaks an algorithm that relies on them).
- Fixed typo in peephole message regarding movq to movl (it said movd instead).
- Made the peephole debug messages more consistent in formatting, some of which now have more detail.
* small fixes of the patch
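
The "if doing so removes the REX prefix" condition used above can be illustrated with a small model. On x86_64 a REX prefix is needed for a 64-bit operand size or when one of the registers r8..r15 is involved, and writing a 32-bit register zero-extends into the full 64-bit register. The sketch below is Python for illustration only; the function names and the use of register numbers 0..15 are assumptions, not the optimizer's code.

```python
def needs_rex(reg_num, operand_bits):
    """REX is required for 64-bit operands or for registers 8..15."""
    return operand_bits == 64 or reg_num >= 8


def shrink_xor(reg_num):
    """Pick the mnemonic for 'xorq %reg,%reg' (same register twice).

    The 32-bit form is only chosen when it drops the REX prefix.
    """
    if not needs_rex(reg_num, 32):
        return "xorl"
    return "xorq"
```

Under this model the register numbered 0 (rax) is shrunk to xorl, while register 8 (r8) keeps xorq because the 32-bit form would still carry a REX prefix.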
git-svn-id: trunk@38070 -
- Moved the part that emits the CMOV command outside of the if-else block, because it's the same in both branches and was just duplicated code.
- Moved a comment about powers of 2 to be right before the correct if-else block.
- Added a couple of comments to explain what the algorithm is doing to obtain the remainder.
- Added missing "writeln('ok');" (since 'tmoddiv3.pp' has it) and program header to 'tmoddiv4.pp'.
- Changed program name from "testfile2" to "tmoddiv3" in 'tmoddiv3.pp'.
git-svn-id: trunk@37939 -
inline assembly, and fixed check after r35959 (mantis #32318)
o can also subscript parameters passed by value on the stack
o can also subscript local variables, the parameters passed by reference
that are subsequently copied into a local
git-svn-id: trunk@37886 -
retfq x86 instructions. These are variants of the ret instruction with the
return offset size set explicitly, e.g. retfw is a 16-bit far ret (i.e. pops
a 16-bit offset and a 16-bit segment), retfd is a 32-bit far ret (pops a
32-bit offset, followed by a 16-bit segment), etc.
git-svn-id: trunk@37571 -
instructions got erroneously converted to 'jmp/call v', if 'v' is an external
far variable that points to certain things (like a local label, exported via
public)
git-svn-id: trunk@37538 -
warning (on the i8086 target) or an error (on i386 and x86_64) when this
instruction is used (because it only works on 8086 and 8088 CPUs)
git-svn-id: trunk@37514 -
default segment base for the ref, in case there's no segment override
* in the internal assembler, use get_default_segment_of_ref to strip redundant
prefixes, instead of always assuming all refs are DS-based
git-svn-id: trunk@37486 -
* taicpu.needaddrprefix now uses is_32_bit_ref on x86_64
* is_16/32/64_bit_ref made part of the aasmcpu unit interface, so they can be
used elsewhere (e.g. in the inline assembler readers)
git-svn-id: trunk@37469 -
specified to be (%esi) or (%edi), when using at&t syntax assembler (this is
not considered an error by intel syntax assemblers, so we're not adding a
warning there, for now)
git-svn-id: trunk@37458 -
* when generating x86 code for parameterized string instructions with the
internal object writer, don't rely on the destination operand being [(r/e)di]
when determining the segment prefix, because when using intel syntax, source
and destination can be anything (only the operand size, the address size and
the source segment are taken into account)
git-svn-id: trunk@37452 -
is_x86_parameterless_string_instruction_op and
is_x86_parameterized_string_instruction_op by removing 'instruction' from
their names
git-svn-id: trunk@37451 -
get_x86_string_op_size
* refactored the AT&T inline asm handling of x86 parameterized string ops, so it
uses the new helper functions
git-svn-id: trunk@37449 -
(movs, cmps, scas, lods, stos, ins, outs) in the inline asm of the i8086, i386
and x86_64 targets. Both intel and at&t syntax are supported.
* NEC V20/V30 instruction 'ins' (available only on the i8086 target, because it
is incompatible with 386+ instructions) renamed 'nec_ins', to avoid conflict
with the 186+ 'ins' instruction.
git-svn-id: trunk@37446 -
* changed most of the variables in the assembler readers used to store constants from aint to tcgint
as aint has only the size of the accumulator while some CPUs (AVR) allow larger constants in instructions
+ allow access to absolute symbols with address type in inline assembler
* allow absolute addresses in avr inline assembler
+ tests
git-svn-id: trunk@37411 -
(16-bit and 32-bit), i386 (16-bit and 32-bit) and x86_64 (32-bit and 64-bit).
Known bug: 32-bit addresses with an offset have their offset truncated to its
low 16 bits on i8086
git-svn-id: trunk@37409 -
process_ea_ref_64_32, process_ea_ref_32 and process_ea_ref_16, indicating
the address size they support; this is done, so that in the future, we can
mix them all on the same x86 architecture and support multiple address sizes
git-svn-id: trunk@37407 -
o improves readability of TX86AsmOptimizer.OptPass1MOV and fixes some spelling mistakes
+ Optimization MovAnd2Mov 2
+ extended Optimization MovTestJxx2TestMov and MovTestJxx2MovTestJxx to take care of 'and' as well
+ Peephole Optimization: movq x,%reg -> movd x,%reg
git-svn-id: trunk@37377 -
* Disable asd_cpu for wasm (generates errors as it disabled the .xmm added at start)
- Remaining problem: 'DT inf' Error: not implemented ...
git-svn-id: trunk@37325 -
easily and so that all the values are now available to the compiler
(previously, there were several, which were mapped to the same value and thus
were only used to make x86ins.dat easier to read)
git-svn-id: trunk@37299 -
from cpubase unit to a method in the tcg class. The reason for doing that is
that this is now a standard part of the 16-bit and 8-bit code generators and
moving to the tcg class allows doing extra checks (not done yet, but for
example, in the future, we can keep track of whether there was an extra
register allocated with getintregister and halt with an internalerror in case
GetNextReg() is called for registers, which weren't allocated as a part of a
sequence, therefore catching a certain class of 8-bit and 16-bit code
generator bugs at compile time, instead of generating wrong code).
- removed GetLastReg() from avr's cpubase unit, because it isn't used for
anything. It might be added to the tcg class, in case it's ever needed, but
for now I've left it out.
* GetOffsetReg() and GetOffsetReg64() were also moved to the tcg unit.
git-svn-id: trunk@37180 -
(i386 and x86_64) code generator (same as the division by a positive power of
2, followed by a NEG instruction, to invert the sign of the result; previously
the code generator generated an IMUL instruction with a magic constant,
followed by shift; the new code sequence should be both shorter and faster)
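
The idea above (divide by the positive power of 2, then negate) can be checked with a short model. The sketch below is Python and uses the usual bias-then-arithmetic-shift technique for truncating signed division; it is an illustration under that assumption, not the exact instruction sequence the code generator emits, and the function name is hypothetical.

```python
def sdiv_neg_pow2(x, k, bits=32):
    """Truncating signed division of x by -(2**k) using shifts and a negate."""
    sign_mask = x >> (bits - 1)        # 0 for x >= 0, -1 for x < 0
    bias = sign_mask & ((1 << k) - 1)  # 2**k - 1 for negative x, else 0
    quotient = (x + bias) >> k         # x div 2**k, rounded toward zero
    return -quotient                   # the NEG step turns this into x div -(2**k)
```

For example, sdiv_neg_pow2(-7, 1) gives 3 and sdiv_neg_pow2(7, 1) gives -3, the same results as a division by -2 that truncates toward zero.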
git-svn-id: trunk@37003 -
with calls to cg.a_op_const_reg in the x86 div code generator, so that the
same code can be used in the future for i8086 as well (SHR and SAR by
constants other than 1 are 186+, so on 8086 they have to go through the CL
register, which is handled correctly in cg.a_op_const_reg)
git-svn-id: trunk@36815 -
the intel assembler reader: no longer parse them as register tokens,
but as local operands that are later converted into registers. This
ensures in particular that the type of the operand is set, which is
necessary in case this operand is later subscripted (as in tasm10a)
git-svn-id: trunk@36288 -
determine whether it's in a register if it's a pure assembler routine
* you can't "index" implicit pointers either using their fields
git-svn-id: trunk@36287 -
register size for OP_SHR/OP_SHL/OP_SAR/OP_ROL/OP_ROR in tcgx86.a_op_reg_reg().
This is required for the in_[shr/shl/sar/rol/ror]_assign_x_y inline nodes.
git-svn-id: trunk@36251 -