Commit Graph

1442 Commits

Author SHA1 Message Date
tg74
1bc0ecec11 bugfix opcode definition vfmadd132pd/ps
git-svn-id: branches/tg74/avx512@39751 -
2018-09-13 11:04:09 +00:00
tg74
ac26adf7c9 bugfix avx512-opcodes
git-svn-id: branches/tg74/avx512@39745 -
2018-09-12 13:59:29 +00:00
tg74
dd967ecfee remove any gather/scatter opcodes for Knights Mill
git-svn-id: branches/tg74/avx512@39742 -
2018-09-12 09:59:04 +00:00
tg74
608992ecf5 minor bugfixes avx512 tests
git-svn-id: branches/tg74/avx512@39740 -
2018-09-12 05:03:05 +00:00
tg74
c611e4814a new avx512 opcodes
git-svn-id: branches/tg74/avx512@39720 -
2018-09-10 06:19:45 +00:00
tg74
1d9cbb4dcb new AVX512 opcodes
git-svn-id: branches/tg74/avx512@39705 -
2018-09-03 05:40:44 +00:00
tg74
914e31dbd1 new AVX512 instructions vextracti..,vextractf..
git-svn-id: branches/tg74/avx512@39674 -
2018-08-27 06:06:27 +00:00
tg74
6985da744b change x86ins.dat vmovq for test tasm9
git-svn-id: branches/tg74/avx512@39654 -
2018-08-20 13:31:42 +00:00
tg74
2b1da37d66 new avx512 instructions and bugfixes avx512
git-svn-id: branches/tg74/avx512@39636 -
2018-08-19 10:18:32 +00:00
tg74
867d145e50 support vector operand bcst,{sae},{er} + k-register
git-svn-id: branches/tg74/avx512@39457 -
2018-07-16 17:06:57 +00:00
tg74
4dc5442fa5 support vector operand writemask,zeroflag
git-svn-id: branches/tg74/avx512@39359 -
2018-07-02 20:20:03 +00:00
florian
78943ea843 + patch by J. Gareth Moreton: x86 optimisations for Jcc and SETcc, resolves #33899
* optimization also added for i386

git-svn-id: trunk@39307 -
2018-06-25 20:40:05 +00:00
florian
af37ca8563 - remove SetccMovbLeaveRet2SetccLeaveRet optimization, this type of code has not been generated for years
git-svn-id: trunk@39306 -
2018-06-25 20:40:04 +00:00
tg74
31e4d4ef5e AVX512 support for MMRegister xmm16..31 and ymm16..31, zmm0..31, vpaddsb support AVX512
git-svn-id: branches/tg74/avx512@39196 -
2018-06-08 06:53:35 +00:00
nickysn
c530e031b1 * synchronize get_saved_registers_int and get_volatile_registers_int for all
calling conventions on i386
* generated code at the caller side for pocall_pascal routines on i386 no longer
  assumes the routine destroys all registers (except ebp) - instead now it
  assumes that it preserves the ebx,esi,edi and ebp registers. This is
  compatible with the pascal calling convention of 32-bit delphi and was already
  honoured by FPC on the callee side.
* updated the list of calling conventions that save all registers, used in
  tx86callnode.can_call_ref, so it is accurate on all x86 platforms - i8086,
  i386 and x86_64.

git-svn-id: trunk@38904 -
2018-05-04 16:16:24 +00:00
nickysn
80a9dab99a * introduce a case statement, based on the calling convention in i386's
tcpuparamanager.get_saved_registers_int

git-svn-id: trunk@38902 -
2018-05-04 15:03:01 +00:00
Jonas Maebe
d69ad8fa41 * removed temppos field again from parameter locations: they're not allocated
by the temp manager of the current procedure

git-svn-id: trunk@38858 -
2018-04-27 19:18:55 +00:00
Jonas Maebe
4686f61002 * keep track of the temp position separately from the offset in references,
so that they can still be freed after the reference has been changed
    (e.g. in case of array indexing or record field accesses) (mantis #33628)

git-svn-id: trunk@38814 -
2018-04-22 17:03:16 +00:00
nickysn
518cdf9674 * replaced the saved_XXX_registers arrays with virtual methods inside
tcpuparamanager, very similar to the existing get_volatile_registers_XXX. The
  new methods are called get_saved_registers_XXX, where XXX is the register
  type ("int", "address", "fpu" or "mm")

git-svn-id: trunk@38794 -
2018-04-19 21:22:16 +00:00
nickysn
3318703ece * moved nf_typedaddr to addrnodeflags (anf_typedaddr)
git-svn-id: trunk@38671 -
2018-04-03 16:41:01 +00:00
florian
8b0bbdcaab * fix flag subregs after r38206
git-svn-id: trunk@38502 -
2018-03-11 20:30:11 +00:00
florian
9b18e39c81 * enable Lea2AddBase and Lea2AddIndex in TX86AsmOptimizer.PostPeepholeOptLea as we have flag tracking now
* some flag allocations fixed

git-svn-id: trunk@38501 -
2018-03-11 20:30:09 +00:00
florian
47927f053a * factored out TX86AsmOptimizer.OptPass1SHLSAL
git-svn-id: trunk@38498 -
2018-03-11 14:35:19 +00:00
marco
f0042a4719 * vcmppd hardcoded primitives like vcmpeqpd.
* required increasing maxinfolen to 9 

git-svn-id: trunk@38404 -
2018-03-03 23:32:54 +00:00
marco
f21a141144 * mantis #32001, add 32 vcmpps variants.
git-svn-id: trunk@38403 -
2018-03-03 23:10:03 +00:00
florian
8c5606b41d + support mmx shifting
git-svn-id: trunk@38367 -
2018-02-27 21:40:12 +00:00
florian
91514da267 * factored out TX86AsmOptimizer.PostPeepholeOptCall
+ use TX86AsmOptimizer.PostPeepholeOptCall on x86-64

git-svn-id: trunk@38278 -
2018-02-17 23:25:01 +00:00
florian
c1de51454c ti386shlshrnode.second_64bit:
  * do not reuse register
  * copy data to a new register only if really necessary

git-svn-id: trunk@38266 -
2018-02-17 12:45:15 +00:00
florian
31f78ea2b6 + implementation of the vectorcall calling convention by J. Gareth Moreton
+ tests

git-svn-id: trunk@38206 -
2018-02-11 17:50:37 +00:00
florian
73fda1ccb6 * factored out OptPass1Sub
+ make use of OptPass1Sub on x86_64 and i8086 as well

git-svn-id: trunk@37572 -
2017-11-10 20:55:22 +00:00
nickysn
ae92973196 + added support for the retw, retnw, retfw, retd, retnd, retfd, retq, retnq and
retfq x86 instructions. These are variants of the ret instruction with the
  return offset size set explicitly, e.g. retfw is a 16-bit far ret (i.e. pops
  a 16-bit offset and a 16-bit segment), retfd is a 32-bit far ret (pops a
  32-bit offset, followed by a 16-bit segment), etc.

git-svn-id: trunk@37571 -
2017-11-10 16:53:29 +00:00
pierre
ba3afefa4c Regenerate register include files after commit 37564: Fix value of NR_DR6 and NR_DR7
git-svn-id: trunk@37565 -
2017-11-07 07:30:42 +00:00
florian
2140b586a6 * i386 building fixed
git-svn-id: trunk@37554 -
2017-11-04 19:42:08 +00:00
florian
4da4b768ec * factored out PostPeepholeOptTest
+ use PostPeepholeOptTest on x86-64

git-svn-id: trunk@37551 -
2017-11-04 19:10:14 +00:00
florian
3097eaf8ee * made PostPeepholeOptMov a function
git-svn-id: trunk@37550 -
2017-11-04 19:10:12 +00:00
florian
a7ea7fb569 * factored out PostPeepholeOptCmp
+ use PostPeepholeOptCmp for x86_64

git-svn-id: trunk@37549 -
2017-11-04 19:10:09 +00:00
nickysn
80226e3af4 + added an optimization pass, that optimizes x86 references
git-svn-id: trunk@37494 -
2017-10-20 15:55:55 +00:00
nickysn
e8bbc4eef9 + support the xlat x86 instruction syntax with a memory operand. This allows
specifying the address size (e.g. xlat byte ptr [bx] or xlat byte ptr [ebx])

git-svn-id: trunk@37478 -
2017-10-17 16:40:06 +00:00
nickysn
0fb79946a5 + added support for the parameterized versions of the x86 string instructions
(movs, cmps, scas, lods, stos, ins, outs) in the inline asm of the i8086, i386
  and x86_64 targets. Both Intel and AT&T syntax are supported.
* NEC V20/V30 instruction 'ins' (available only on the i8086 target, because it
  is incompatible with 386+ instructions) renamed 'nec_ins', to avoid conflict
  with the 186+ 'ins' instruction.

git-svn-id: trunk@37446 -
2017-10-12 00:07:02 +00:00
nickysn
92a52a9f4d + implemented support for instructions with non-native address size on i8086
(16-bit and 32-bit), i386 (16-bit and 32-bit) and x86_64 (32-bit and 64-bit).
  Known bug: 32-bit addresses with an offset have their offset truncated to its
  low 16-bits on i8086

git-svn-id: trunk@37409 -
2017-10-06 15:27:14 +00:00
florian
15b617546e + call TX86AsmOptimizer.OptPass1VOP for logical operations as well
git-svn-id: trunk@37367 -
2017-10-01 14:40:21 +00:00
nickysn
aec03309ef + added CPUX86_HAS_SSE2 to x86 tcpuflags
git-svn-id: trunk@37326 -
2017-09-26 16:02:56 +00:00
nickysn
e701fa8de1 * converted the x86 instruction flags to a set, so they can be extended more
easily and so that all the values are now available to the compiler
  (previously, there were several, which were mapped to the same value and thus
  were only used to make x86ins.dat easier to read)

git-svn-id: trunk@37299 -
2017-09-21 15:48:27 +00:00
nickysn
ab62e2237b * mark the sldt,syscall,sysenter,sysexit,sysret,andn,bextr,rorx,sarx,shlx and
shrx instructions as protected mode only

git-svn-id: trunk@37275 -
2017-09-20 15:43:23 +00:00
florian
05ecd784f2 * factored out OptPass1LEA and use it for x86-64 as well
+ LEAMov2LEA optimization

git-svn-id: trunk@37199 -
2017-09-13 20:40:32 +00:00
nickysn
aefa317474 + fast and branchless implementation of abs(int64) for i386
git-svn-id: trunk@37169 -
2017-09-10 17:25:47 +00:00
florian
22956c4393 + TX86AsmOptimizer.OptPass1OP
git-svn-id: trunk@36365 -
2017-05-28 13:49:43 +00:00
florian
912e6d129a * fix modification flags for *ROUND*
git-svn-id: trunk@36280 -
2017-05-21 11:12:57 +00:00
florian
ddfaf59626 * fix compilation with -Cr
git-svn-id: trunk@36276 -
2017-05-21 08:34:42 +00:00
florian
0f16f6d94d + OptPass1MOVXX
git-svn-id: trunk@36209 -
2017-05-14 20:59:10 +00:00
florian
535c990233 + OptPass1MOVAP
git-svn-id: trunk@36203 -
2017-05-13 21:48:44 +00:00
florian
f4a29bb75d * moved InstructionLoadsFromReg and RegReadByInstruction from TCpuAsmOptimizer (i386) to TX86AsmOptimizer
git-svn-id: trunk@36200 -
2017-05-13 09:58:25 +00:00
nickysn
efc5e339d0 * use an enum instead of integer constants to represent inline numbers
* compinnr.inc include file converted to a unit
* inline number field size stored in ppu increased from byte to longint
* inlines in the parse tree (when written with the -vp option) now printed with
  their enum name, instead of number

git-svn-id: trunk@36174 -
2017-05-10 14:41:43 +00:00
florian
b1dff29cbf * removed unused units
git-svn-id: trunk@36165 -
2017-05-09 19:53:14 +00:00
florian
52d3756c26 * factored out OptPass1Movx and merged i386 and x86-64 version
git-svn-id: trunk@36159 -
2017-05-08 20:44:27 +00:00
florian
06c4c651fd * factored out PrePeepholeOptSxx
+ x86-64 uses PrePeepholeOptSxx now as well

git-svn-id: trunk@36158 -
2017-05-08 20:44:24 +00:00
florian
dd69ab5488 * cleanup after all old code from PeepHoleOptPass2 of i386 was moved to the common x86 optimizer class
git-svn-id: trunk@36147 -
2017-05-07 16:18:37 +00:00
florian
cd134ea5bb + DebugMsg
git-svn-id: trunk@36146 -
2017-05-07 16:18:35 +00:00
florian
7afe762d22 * factored out OptPass2Jcc assembler optimization
* OptPass2Jcc now used by x86-64 as well
* remove orphaned alignments if the label is not used anymore after cmov is used

git-svn-id: trunk@36143 -
2017-05-07 12:45:48 +00:00
florian
e3f0b338d4 * SkipLabels moved to aoptutils
* factored out OptPass2Jmp assembler optimization
* OptPass2Jmp now used by x86-64 as well

git-svn-id: trunk@36141 -
2017-05-06 21:07:02 +00:00
nickysn
af48d176ec + precise flag information for the ucomiss,ucomisd,vucomiss and vucomisd x86 instructions
git-svn-id: trunk@36115 -
2017-05-05 13:41:43 +00:00
nickysn
0cd70844f1 + take into account the fact that lea doesn't read the segment register of its
reference in i386's TCpuAsmOptimizer.RegReadByInstruction

git-svn-id: trunk@36080 -
2017-05-04 14:13:53 +00:00
nickysn
d5d53e7017 * fixed operand order in the check for sse movsd in i386's
TCpuAsmOptimizer.RegReadByInstruction

git-svn-id: trunk@36003 -
2017-04-28 14:56:54 +00:00
nickysn
ff1ee6836d + fix RegReadByInstruction for the x86 MOVSD instruction
git-svn-id: trunk@35968 -
2017-04-27 14:42:08 +00:00
nickysn
b741e38f98 + precise handling for x86 conditions and their flag bits in i386's
TCpuAsmOptimizer.RegReadByInstruction

git-svn-id: trunk@35965 -
2017-04-27 12:07:48 +00:00
nickysn
0f010430cc + better precision in determining the registers, read by mul/imul/div/idiv in
i386's TCpuAsmOptimizer.RegReadByInstruction:
  * mul doesn't read edx (unless included in operand)
  * 8-bit mul and imul don't read ah (unless included in operand)
  * 8-bit div and idiv don't read edx (unless included in operand)

git-svn-id: trunk@35958 -
2017-04-26 16:17:31 +00:00
nickysn
916c09af55 + also check the register type when checking for specific integer registers in
i386's TCpuAsmOptimizer.RegReadByInstruction. Previously, the lack of this
  check could generate false reads on some other register types (e.g. mmx/xmm/
  flags, etc.), and this could worsen optimizations.

git-svn-id: trunk@35957 -
2017-04-26 15:25:38 +00:00
nickysn
618b6292ee + support testing for individual bits from the x86 flags register in i386's
TCpuAsmOptimizer.RegReadByInstruction()

git-svn-id: trunk@35956 -
2017-04-26 14:38:36 +00:00
nickysn
c8487c4150 + added individual bits of the x86 flags register as subregisters
git-svn-id: trunk@35955 -
2017-04-26 13:52:52 +00:00
nickysn
5f66f5cebb + distinguish between x86 flags subregisters: flags, eflags and rflags
git-svn-id: trunk@35953 -
2017-04-25 16:10:43 +00:00
nickysn
0c244046a9 * proper register change info for the movs,cmps and scas x86 string instructions
(movsd still todo, because of the overlap with the sse2 instruction)

git-svn-id: trunk@35929 -
2017-04-23 21:30:25 +00:00
nickysn
1d34e96064 + added x86 instruction flag Ch_RFLAGScc, indicating instructions that read
specific bits from the flags register, according to their condition (used by
  Jcc/SETcc/CMOVcc)

git-svn-id: trunk@35907 -
2017-04-22 22:07:05 +00:00
nickysn
1146b7c12c + added detailed information for individual flag bits use for most x86
instructions. Not used by the compiler yet, but may allow more
  optimizations in the future.

git-svn-id: trunk@35882 -
2017-04-21 23:03:33 +00:00
nickysn
869f395a31 + added knowledge to the compiler for the x86 instructions, that don't read
their input registers, in case both parameters are the same register (e.g.
  xor eax, eax; sub eax, eax; etc.)

git-svn-id: trunk@35861 -
2017-04-20 15:11:56 +00:00
nickysn
af235cae86 * use TEST CL,32 instead of TEST ECX,32 in the beginning of a 64-bit shl/shr
sequence on i386

git-svn-id: trunk@35856 -
2017-04-19 21:30:31 +00:00
nickysn
12a1ad66b2 + added the Ch_RDirFlag change attribute to the STOSx instructions (previously
was missing, due to the 3 attributes per instruction limit)

git-svn-id: trunk@35855 -
2017-04-19 20:23:24 +00:00
nickysn
9303a8f61a * changed the x86 TInsProp.Ch structure from a 3-element array to a pascal set;
this removes the limit of 3 Ch_XXX flags per instruction (thus allowing adding
  more precise flags, e.g. for tracking only certain bits of the flags register,
  etc.) and avoids the ugliness of having the Ch_None filler, which makes
  x86ins.dat less readable.

git-svn-id: trunk@35850 -
2017-04-19 16:48:35 +00:00
nickysn
189e49998c * fixes to the x86 instruction flags tracking attributes:
  * AAA and AAS also read flags (AF)
  * CMC reads and writes flags (it inverts CF)
  * CMPSx and SCASx write flags
  * CMPSx, SCASx, LODSx, STOSx, MOVSx read the direction flag
  * NOT doesn't affect flags
  * REP isn't affected by and doesn't affect flags
  * REPE/REPNE/REPZ/REPNZ/REPC/REPNC don't write flags, only read them
  * ROL and ROR don't read flags
  * SAL doesn't read flags
  * SHLD and SHRD don't read flags

git-svn-id: trunk@35849 -
2017-04-19 15:42:50 +00:00
nickysn
e708a76f70 * some i386 optimizations for 64-bit SHL/SHR/SAR in tcg64f386.a_op64_const_reg:
  * only use SHx/RCx when optimizing for size
  * use ADD reglo,reglo + ADC reghi,reghi for SHL by 1 on i386 and i486

git-svn-id: trunk@35841 -
2017-04-18 21:30:31 +00:00
nickysn
0264c4cace + implemented OP_SHR/OP_SHL/OP_SAR correctly in tcg64f386.a_op64_const_ref for
const values larger than 31

git-svn-id: trunk@35838 -
2017-04-18 16:02:48 +00:00
nickysn
d7b8d8dd54 * don't emit the "SUB ECX,32" instruction on i386, when doing a 64-bit shift by
reg, with a value >=32. The instruction is redundant, because the SHL/SHR
  instructions already AND mask the shift count by 31.

git-svn-id: trunk@35836 -
2017-04-18 15:09:20 +00:00
nickysn
03dfa07ebc + implemented OP_SHR/OP_SHL/OP_SAR in i386's tcg64f386.a_op64_reg_ref
git-svn-id: trunk@35834 -
2017-04-18 14:34:20 +00:00
nickysn
10d7603dce + implemented OP_SHR/OP_SHL/OP_SAR support in tcg64f386.a_op64_reg_reg
git-svn-id: trunk@35831 -
2017-04-18 12:24:46 +00:00
nickysn
7e8c89435f * avoid the AND instruction in the i386 shr64/shl64 code, by using TEST+JZ,
instead of CMP+JL

git-svn-id: trunk@35830 -
2017-04-18 11:36:48 +00:00
nickysn
a1ad705646 + allocate and free flags before and after the shl+rcl/shr+rcr/sar+rcr sequences
git-svn-id: trunk@35786 -
2017-04-13 11:58:51 +00:00
nickysn
cddb48bad4 + i386 implementation of a_op64_const_reg for OP_SHR,OP_SHL and OP_SAR; needed
by the in_shl/shr/sar_assign_x_y inline nodes

git-svn-id: trunk@35785 -
2017-04-13 11:54:19 +00:00
nickysn
6a710964f2 + i386 implementation of a_op64_const_ref for OP_SHR,OP_SHL and OP_SAR; needed
by the in_shl/shr/sar_assign_x_y inline nodes

git-svn-id: trunk@35784 -
2017-04-13 10:38:33 +00:00
nickysn
256dc546ac + implemented the in_neg_assign_x and in_not_assign_x inline nodes, which will
be used (TBD in a future commit) for optimizing x:=-x and x:=not x on CPUs
  that support performing these operations directly in memory (such as x86)

git-svn-id: trunk@35749 -
2017-04-07 16:02:40 +00:00
nickysn
6580dfee39 * generate better i386 code for 64-bit shl/shr, by masking the shift count by
63, instead of comparing it to 64 and branching. Note that, although this
  changes the behaviour of 64-bit shifts by values larger than 64 (when stored
  in a variable), it actually makes them consistent with both the code,
  generated on x86_64, as well as with 64-bit shift by constant on i386 itself.

git-svn-id: trunk@35727 -
2017-04-04 16:28:54 +00:00
nickysn
5cb724edd9 + added optimized implementation of a_op64_reg_ref for i386 as well; improves
generated code for inc(int64_var,int64_var) and dec(int64_var,int64_var)

git-svn-id: trunk@35660 -
2017-03-25 21:40:20 +00:00
Jonas Maebe
4c68ea1000 * use pocalls_cdecl and cstylearrayofconst more consistently instead of
ad hoc set constants containing a varying number of cdecl-like calling
    conventions
   o added pocall_sysv_abi_cdecl and pocall_ms_abi_cdecl to cstylearrayofconst
   o also allow C-style blocks with mwpascal instead of cdecl (mwpascal = cdecl
     with "const" = "constref" for record parameters)
   o did not touch cases related to name mangling and import/export names,
     because those are a real mess and easily break things left and right :/

git-svn-id: trunk@35479 -
2017-02-25 11:46:35 +00:00
florian
f68558b88c * factored out TX86AsmOptimizer.OptPass2Imul
git-svn-id: trunk@35252 -
2017-01-06 22:25:24 +00:00
Jonas Maebe
880d438704 * renamed t<cpuname>procinfo to tcpuprocinfo for all targets, so we can
inherit from it for LLVM without a thousand ifdefs

git-svn-id: trunk@35141 -
2016-12-16 22:41:21 +00:00
Károly Balogh
0cb555c07c syscalls: move the reference implementation of parseparaloc to paramgr. removes two identical copies from CPU specific code and enables basereg convention for AROS/x86_64. also, other minor fixes and cleanups in related code.
git-svn-id: trunk@35047 -
2016-12-03 19:00:41 +00:00
Károly Balogh
f5f895e2a3 syscalls: unify call reference creation across 4 different CPU archs. less copypasted code, brings x86_64 AROS support up to speed
git-svn-id: trunk@35034 -
2016-12-02 09:29:09 +00:00
Jonas Maebe
a25ebbba3e + added volatility information to all memory references
   o separate information for reading and writing, because e.g. in a
     try-block, only the writes to local variables and parameters are
     volatile (they have to be committed immediately in case the next
     instruction causes an exception)
   o for now, only references to absolute memory addresses are marked
     as volatile
   o the volatility information is (should be) properly maintained throughout
     all code generators for all architectures with this patch
   o no optimizers or other compiler infrastructure uses the volatility
     information yet
   o this functionality is not (yet) exposed at the language level, it
     is only for internal code generator use right now

git-svn-id: trunk@34996 -
2016-11-27 18:17:37 +00:00
sergei
133fcb5ab2 * Fixed VMOVQ instruction encoding, now assembles correctly also in 32-bit code.
+ Test

git-svn-id: trunk@34949 -
2016-11-21 13:59:44 +00:00
sergei
ebe134febc * Fixed memory reference size for MOVSS instruction, Mantis #29954.
git-svn-id: trunk@34943 -
2016-11-21 03:31:25 +00:00
sergei
870fda34d5 * x86 AT&T reader and writer: cleaned up usage of attsufMM suffix:
  * It is now only used to select size of vector instructions (i.e. 128 or 256 bits)
  * Scalar instructions reverted to use attsufINT suffix (selecting between 32 or 64 bits).
  * Additionally, vcvtsi2sd and vcvtsi2ss with rm64 operand are x86_64 only.

git-svn-id: trunk@34942 -
2016-11-21 02:07:13 +00:00
sergei
edf943a4f6 * Changed memory operand size for VMOVSS instruction to 32 bits, Mantis #29957.
git-svn-id: trunk@34918 -
2016-11-18 23:37:01 +00:00
florian
56252d59f0 + support for the PREFETCHTW1 instruction based on a patch by Emelyanov Roman, resolves #30933
git-svn-id: trunk@34917 -
2016-11-18 20:19:39 +00:00
svenbarth
fc5ce63134 * fix for Mantis #30832: instead of checking a procdef's struct for df_generic check the procdef itself, this way global generic methods or generic methods that are part of non-generic classes or records are caught as well.
+ added test

git-svn-id: trunk@34914 -
2016-11-18 14:01:03 +00:00
Károly Balogh
c7c37f66ed * refactored syscall types for unified naming, first bits of ARM AROS syscall support
git-svn-id: trunk@34806 -
2016-11-06 12:41:56 +00:00
Jonas Maebe
0afbe85aab * various memory reference alignment fixes
git-svn-id: trunk@34544 -
2016-09-20 21:43:19 +00:00
Károly Balogh
464ecab542 huge syscall support refactor for Amiga-likes. removed large chunks of ancient duplicated code, and in general tried to make the entire thing more maintainable and cleaner. also added support for AROS EAXBase syscall convention
git-svn-id: trunk@34416 -
2016-09-03 07:57:23 +00:00
yury
649823a246 * Removed unused vars.
git-svn-id: trunk@34405 -
2016-09-01 20:01:54 +00:00
Jonas Maebe
aa1be3276f - removed default value of _typ parameter of TAsmData.(Weak)RefAsmSymbol():
it was AT_NONE, which is invalid and should never be used
  * explicitly pass the correct value for all calls to those methods elsewhere
    in the compiler

git-svn-id: trunk@34250 -
2016-08-05 07:09:16 +00:00
Jonas Maebe
a0efde8167 * automatically generate necessary indirect symbols when a new assembler
symbol is defined
   o removed all places where AB_INDIRECT symbols were explicitly generated
   o only generate AB_INDIRECT symbols for AT_DATA on systems_indirect_var_imports
   o for some symbols an indirect symbol is always required (because they are
     dereferenced by code in RTL units) -> use new AT_DATA_FORCEINDIRECT type

git-svn-id: trunk@34165 -
2016-07-20 20:53:03 +00:00
Jonas Maebe
1cb8c0d00c * specify the def of assembler level symbols defined via
tasmdata.DefineAsmSymbol() and all routines that call it
   o will be used to automatically generate AB_INDIRECT symbols when
     necessary

git-svn-id: trunk@34164 -
2016-07-20 20:52:59 +00:00
florian
7f44774852 * i386 uses OptPass1And from aoptx86
git-svn-id: trunk@33936 -
2016-06-07 20:01:13 +00:00
florian
5e8e21c1be * factored out OptPass2MOV code, x86-64 uses it as well now
git-svn-id: trunk@33932 -
2016-06-06 21:18:24 +00:00
florian
e56147ac6e * integrated mov op mov -> op optimization in aoptx86
* isFoldableArithOp is in aoptx86 now

git-svn-id: trunk@33928 -
2016-06-06 21:18:18 +00:00
florian
ba54f7243e * moved all i386 mov peephole optimization code into OptPass1MOV
git-svn-id: trunk@33908 -
2016-06-04 19:34:18 +00:00
florian
20807f4148 * factored out V<Op> optimizations into OptPass1VOP
* call OptPass1VOP also for i386

git-svn-id: trunk@33878 -
2016-06-01 20:49:35 +00:00
florian
a7516dfb50 * fix modification information of VCOMISS and VCOMISD
git-svn-id: trunk@33874 -
2016-06-01 19:58:43 +00:00
florian
0c13f3ce3e * fix modification information for vand*
git-svn-id: trunk@33593 -
2016-05-01 12:00:25 +00:00
florian
bd54a11f1c + TX86AsmOptimizer.OptPass1VMOVAP for i386 and x86-64
+ new unit aoptutils with helpers for the assembler optimizer

git-svn-id: trunk@33587 -
2016-05-01 09:37:21 +00:00
florian
ec92bc3390 * case of identifiers fixed
* x86-64 uses also the mov $0,... -> xor optimization

git-svn-id: trunk@33553 -
2016-04-24 20:01:43 +00:00
florian
f0e75de730 * properly update allocation info of the involved register when carrying out an MovMovCmp2MovCmp optimization, resolves issue #30052
* a few changes to make the code more readable

git-svn-id: trunk@33551 -
2016-04-24 15:57:06 +00:00
florian
8d9f6bbe0b * disable some debugging code which does not work anymore due to the unification of the peephole optimizer
git-svn-id: trunk@33546 -
2016-04-22 20:31:25 +00:00
florian
77b4709e7a + i386 compiler tracks now flag usage if needed, so the mov $0,reg -> xor reg,reg transformation can be enabled
git-svn-id: trunk@33545 -
2016-04-22 19:44:26 +00:00
florian
3c2dab9878 * i386 peephole assembler largely uses the common peephole optimizer infrastructure; apart from a few improvements, the resulting code is the same
git-svn-id: trunk@33542 -
2016-04-21 20:14:01 +00:00
florian
a742df9035 * reverse merged r33524 as it is not safe as test results showed
--- Reverse-merging r33524 into '.':
U    compiler\i386\popt386.pas
U    compiler\x86\cgx86.pas
--- Recording mergeinfo for reverse merge of r33524 into '.':
 U   .

git-svn-id: trunk@33527 -
2016-04-17 11:33:29 +00:00
florian
f576b0c01b * make use of xor reg,reg by generating it directly instead of hoping for the peephole
optimizer which cannot do this properly due to missing information about flags. By doing
  so the size of the compiler executable gets reduced by ~1 %

git-svn-id: trunk@33524 -
2016-04-15 19:27:22 +00:00
florian
2dbcdbe466 + peephole optimizer: change jmp .L1 ... .L1: ret into ret
git-svn-id: trunk@33523 -
2016-04-15 19:11:43 +00:00
Károly Balogh
4ed3a3f09a * re-read the libbase already pushed on the stack for AROS syscalls, instead of trying to re-resolve it. should fix threadvar libbases on AROS.
git-svn-id: trunk@33455 -
2016-04-08 22:42:29 +00:00
florian
406e3c4ac1 + support xgetbv instruction, resolves issue #29958
git-svn-id: trunk@33418 -
2016-04-03 20:53:10 +00:00
florian
8d5cc3dfa4 * (extended and modified) patch by Emelyanov Roman to add support for the RDRAND, RDSEED and TSX instruction sets, resolves issue #29893.
In comparison with the original patch, support for i386 has been added as well as a test program.
  Further, a small issue with xbegin has been fixed

git-svn-id: trunk@33375 -
2016-03-28 19:08:13 +00:00
nickysn
cf3230b100 - removed IF_CENTAUR and replaced it with IF_CYRIX. Rationale: only 3
Centaur-specific instructions were marked as CENTAUR, all the others were marked
  CYRIX, so it wasn't an accurate flag at all

git-svn-id: trunk@33326 -
2016-03-25 17:01:11 +00:00
nickysn
5f87ac5d47 + added 486 to the list of supported CPUs on the i8086 and i386 targets
git-svn-id: trunk@33317 -
2016-03-23 15:07:56 +00:00
svenbarth
f297b00f5b Extend the x86 targets by the ability to handle indirect symbols.
x86/cgx86.pas, tcgx86:
  + new method make_direct_ref() which is used to convert an indirect reference into a direct one (uses the boolean field in_make_direct_ref to avoid recursive calls)
  * make_simple_ref: call make_direct_ref() before anything else
  * a_loadaddr_ref_ref: call make_direct_ref() (the loading could probably be folded into the loadaddr method, but for now that is sufficient)
i386/cgcpu.pas, tcg386:
  * a_loadaddr_ref_cgpara: call make_direct_ref(); the same remark as for a_loadaddr_ref_ref() applies here

git-svn-id: trunk@33280 -
2016-03-18 21:45:41 +00:00
svenbarth
77ede2ac9f i386/cgcpu.pas, tcg386:
  * a_load_ref_cgpara: call make_simple_ref() before calling the base a_load_ref_cgpara()
x86/cgx86.pas, tcgx86:
  * a_loadfpu_ref_reg, a_loadfpu_reg_ref, g_concatcopy: call make_simple_ref() on the passed references

git-svn-id: trunk@33277 -
2016-03-18 21:22:04 +00:00
svenbarth
570607b1d1 * revert r33273; haven't seen that Florian has already assigned that to himself... Oops
git-svn-id: trunk@33274 -
2016-03-18 14:26:24 +00:00
svenbarth
e4fa7928f9 Fix for Mantis #29527.
i386/popt386.pas, PeepHoleOptPass1:
  * disable the call to RegLoadedWithNewValue() as that method isn't implemented for any of the x86 optimizers (but add a ToDo so that it isn't forgotten)

git-svn-id: trunk@33273 -
2016-03-18 14:24:55 +00:00
sergei
0f301b4c57 * Fixed spilling info for vcvt* instructions, part of Mantis #29783.
git-svn-id: trunk@33208 -
2016-03-09 16:36:30 +00:00
nickysn
80b3e3020a * the SEGFS and SEGGS prefixes are 386+
git-svn-id: trunk@32925 -
2016-01-11 15:51:40 +00:00
nickysn
741a3eedf9 * fixed the cpu level of several 186+ instructions, that were mistakenly marked as either 286+ or 8086+
git-svn-id: trunk@32921 -
2016-01-11 13:22:08 +00:00
nickysn
6037976202 * several imul variants, featuring 32-bit or 64-bit registers marked 386+, instead of 286+
git-svn-id: trunk@32889 -
2016-01-08 17:07:36 +00:00
nickysn
66bad5a1cf * pushf and popf are 8086 level instructions, not 186+
git-svn-id: trunk@32677 -
2015-12-17 15:23:21 +00:00
florian
a3964d9ee0 + support for RDTSCP, resolves issue #28916
git-svn-id: trunk@32652 -
2015-12-13 13:28:51 +00:00
sergei
a78250a78b * x87 FBSTP and FBLD instructions cannot have size suffix in ATT syntax. Mantis #29095.
git-svn-id: trunk@32541 -
2015-11-27 03:59:06 +00:00
yury
78b4950b97 * Fixed calling of external procs for i386 non-darwin targets when PIC is enabled.
git-svn-id: trunk@32536 -
2015-11-26 17:04:55 +00:00
Jonas Maebe
fa3b0ca312 * support marking defs created via the getreusable*() class methods as
"don't free even if not registered"; use for defs that may not be written
    to a ppu file, but that must nevertheless survive the compilation of the
    current module
  * mark all defs created for para locations as "don't free even if not
    registered", because we don't discard and recalculate all para locations
    after a module has been compiled (since that's not needed)
   o solves issues if the paralocations for a routine in the interface of
     unit A are calculated while the implementation of unit B gets
     compiled, and a new reusable type is allocated at that point which
     is not used anywhere else (after r32160)

git-svn-id: trunk@32235 -
2015-11-04 20:46:18 +00:00
yury
6537b99ac3 * i386: Fixed detection of a peephole optimization using CMOV.
git-svn-id: trunk@32115 -
2015-10-21 15:59:12 +00:00
yury
862348c317 * Keep the GOT offset in a virtual register for i386 non-darwin platforms.
It fixes PIC code generation with GOT for i386 with optimizations enabled. Bugs #28667, #28668.
  Prior to the fix I was not able to compile even the RTL with -O2 due to a lack of free registers, since EBX is reserved for GOT.

  It can be further optimized by teaching the register allocator not to spill the GOT register if possible.

git-svn-id: trunk@32020 -
2015-10-12 08:02:56 +00:00
florian
c40240990e * popt386 uses now also all routines of aoptx86
git-svn-id: trunk@31894 -
2015-09-29 19:31:33 +00:00
florian
53ea4fb7d4 * unify x86 peephole optimizer helpers
git-svn-id: trunk@31843 -
2015-09-27 09:36:39 +00:00
svenbarth
529677cc79 ncal.pas:
  * extend tcallnode with the ability to pass a tspecializationcontext so that tcallcandidates can do a final specialization
  * the final procdef is registered at the end of tcallnode.pass_typecheck

git-svn-id: trunk@31763 -
2015-09-18 14:48:54 +00:00
Jonas Maebe
3c6aa91a96 * factored out the loading of threadvars in its own method, and put the
x86-specific part in nx86ld

git-svn-id: trunk@31639 -
2015-09-12 23:32:53 +00:00
sergei
e542800ea9 * Win64 SEH: Track control flow out of unwind-protected regions in a more precise way and don't generate expensive calls to __fpc_local_unwind when not necessary.
git-svn-id: trunk@31582 -
2015-09-09 18:43:46 +00:00