Commit Graph

331 Commits

Author SHA1 Message Date
J. Gareth "Curious Kit" Moreton
33cf86ff9f PostPeepholeOptTestOr now removes TEST when dealing with POPCNT and LZCNT 2022-01-06 20:57:48 +00:00
J. Gareth "Curious Kit" Moreton
116c861af6 MOV/CMP optimisation is now in both Pass 1 and Pass 2 to catch more
eventualities
2021-12-31 14:28:35 +00:00
J. Gareth "Curious Kit" Moreton
8609c0803e Fixed MovxOp2Op failing on i386 due to lack of register check 2021-12-26 16:20:18 +00:00
J. Gareth "Curious Kit" Moreton
f289f2694a x86: Additions to OptPass2Movx to better synergise with new CMP optimisation under -O2 2021-12-25 19:07:48 +00:00
J. Gareth "Curious Kit" Moreton
683a92bcc8 i386: Correction to GetIntRegisterBetween to ensure we only get 8-bit registers that we can actually encode 2021-12-25 19:07:48 +00:00
J. Gareth "Curious Kit" Moreton
1da7ce46de x86: New double CMP optimisation to remove a branch 2021-12-25 19:07:48 +00:00
J. Gareth "Curious Kit" Moreton
cafd708b6d Refactoring of OptPass2Movx to remove goto 2021-12-25 16:38:10 +00:00
J. Gareth "Curious Kit" Moreton
22cd8d5d62 Fixed bug in MovxMovx2Movx optimisation that would specify a 64-bit destination instead of 32-bit one 2021-12-25 14:49:08 +00:00
J. Gareth "Curious Kit" Moreton
b4c8c1da12 Overflow bug fixes to MovZX/SX optimisations when CMP instructions are encountered. 2021-12-23 07:14:49 +00:00
florian
6dbe71cd30 * TX86AsmOptimizer.OptPass1MOVXX should search only over other instructions if it works with registers only 2021-12-22 22:54:11 +01:00
florian
6147d6d8a0 * compilation with i386 fixed 2021-12-21 22:46:12 +01:00
J. Gareth "Curious Kit" Moreton
d083cc7247 New MovxAndTest2Test optimisation to mirror the regular MovAndTest2Test optimisation 2021-12-20 22:10:22 +00:00
J. Gareth "Curious Kit" Moreton
5b4c104aaf Massive overhaul to OptPass2Movx to favour operand shrinkage 2021-12-20 22:10:22 +00:00
J. Gareth "Curious Kit" Moreton
d255ffba8b Improved handling of signed sequences in OptPass2Movx 2021-12-20 22:10:22 +00:00
J. Gareth "Curious Kit" Moreton
01e5f4855a MovZX->MovSX optimisation 2021-12-20 22:10:22 +00:00
J. Gareth "Curious Kit" Moreton
4825d2d16c New Movz ###,%ecx, shift/rotate %cl,... optimisation 2021-12-20 22:10:22 +00:00
J. Gareth "Curious Kit" Moreton
f02b7508de Bolder OptPass2Movx optimisations, including a simplification fix 2021-12-20 22:10:22 +00:00
J. Gareth "Curious Kit" Moreton
da899df6b2 New MovxMovxOp2OpMovx optimisation 2021-12-20 22:10:22 +00:00
J. Gareth "Curious Kit" Moreton
40196f4a43 Fixes to ADD/SUB 128 optimisation that didn't check flags properly, and also handling ADC/SBB properly 2021-12-19 20:51:57 +00:00
J. Gareth "Curious Kit" Moreton
b4bd15a5c0 Removed incorrect logic in TEST optimisation 2021-12-17 22:10:12 +00:00
J. Gareth "Curious Kit" Moreton
be448e29f6 Fixed bug in new TEST optimisation where a FLAGS check always returned "in use" 2021-12-15 20:14:26 +00:00
J. Gareth "Curious Kit" Moreton
42c429bf45 New optimisation that merges small constants written to the stack 2021-12-15 19:47:50 +00:00
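
The commit message above does not say which store widths or conditions the optimisation covers. As a rough illustration only, the sketch below assumes two adjacent 16-bit constant stores at offsets 0 and 2 on a little-endian target, and shows that they can be expressed as a single 32-bit constant store. The function name `merge_adjacent_stores` is hypothetical and is not taken from the compiler source.

```python
import struct


def merge_adjacent_stores(lo_val: int, hi_val: int, width_bits: int = 16) -> int:
    """Combine two equally sized constants into one double-width constant.

    Assumption: lo_val is stored at the lower address and hi_val directly
    after it, on a little-endian machine such as x86.
    """
    mask = (1 << width_bits) - 1
    return ((hi_val & mask) << width_bits) | (lo_val & mask)


# Two 16-bit stores (e.g. movw $1,(%rsp); movw $2,2(%rsp)) ...
separate = struct.pack("<HH", 1, 2)
# ... leave the same bytes in memory as one 32-bit store of the merged value.
merged = struct.pack("<I", merge_adjacent_stores(1, 2))
```

Under these assumptions `separate` and `merged` hold identical bytes, which is the property a merging pass of this kind would rely on.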
J. Gareth "Curious Kit" Moreton
7a15312b54 Safety checks on TEST removals and better FLAG tracking 2021-12-13 16:11:33 +00:00
J. Gareth "Kit" Moreton
f60523a3b9 x86: New TEST optimisations 2021-12-12 21:40:42 +00:00
Yuriy Sydorov
7b2cd0bcdc * Prevent a range check error in case of big unsigned values. 2021-12-12 17:55:46 +02:00
J. Gareth "Curious Kit" Moreton
2dc0995067 - Bug fix to new ADD/SUB optimisation where conditions are concerned
- Register allocation fixes for overflow checks
2021-11-17 20:18:57 +00:00
J. Gareth "Curious Kit" Moreton
9f60628e5b x86: new optimisation to change add/sub 128,(dest) to sub/add -128,(dest) to reduce binary size 2021-11-14 21:38:38 +00:00
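
The size saving described in this commit comes from the x86 immediate encoding: an 8-bit immediate is sign-extended, so it covers -128 to 127, and +128 needs a full 32-bit immediate while -128 fits in one byte. The sketch below models only that size comparison for 32-bit operands. The helper names are hypothetical, and the real optimiser also has to consider flags (see the later fix in 40196f4a43), since the carry flag can differ between the two forms.

```python
from typing import Tuple


def imm_size(value: int) -> int:
    """Immediate bytes for a 32-bit add/sub: 1 if it fits a sign-extended imm8, else 4."""
    return 1 if -128 <= value <= 127 else 4


def shrink_add_sub(op: str, imm: int) -> Tuple[str, int]:
    """Swap add<->sub and negate the immediate when that gives a shorter encoding."""
    swapped = {"add": "sub", "sub": "add"}[op]
    if imm_size(-imm) < imm_size(imm):
        return swapped, -imm
    return op, imm


# Only +128 benefits: it is the single value whose negation fits in imm8
# while the value itself does not.
example = shrink_add_sub("add", 128)
```

Here `example` is `("sub", -128)`, while inputs such as `("add", 127)` or `("add", 129)` come back unchanged.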
Pierre Muller
8e7791ac23 Explicitly disable overflow for offset propagation optimization 2021-11-08 22:55:44 +00:00
florian
7fcbd1d7e0 * my last commit hopefully fixed 2021-11-07 14:58:17 +01:00
florian
492d75483d * fix (V)Cvtss2CvtSd(V)Cvtsd2ss2* optimizations for non-avx code, resolves #39416 2021-11-07 14:46:13 +01:00
florian
44051b4af3 * corrected accidentally made changes in 01a449c8, resolves #39424 2021-11-03 22:41:07 +01:00
J. Gareth "Curious Kit" Moreton
284317d877 Fixed OptPass2Lea not honouring symbols 2021-10-31 15:44:00 +00:00
J. Gareth "Curious Kit" Moreton
42eb06f5c6 Fixed some range check problems 2021-10-31 15:44:00 +00:00
J. Gareth "Curious Kit" Moreton
b58fdc3e58 Improved ADD and SUB optimisations for LEA instructions 2021-10-31 15:44:00 +00:00
florian
10fcae34a9 * improved TX86AsmOptimizer.OptPass1MOVXX 2021-10-24 18:38:23 +02:00
florian
4610980f2e * TX86AsmOptimizer.OptPass1MOVXX takes care of volatility 2021-10-23 23:40:09 +02:00
J. Gareth "Curious Kit" Moreton
342803532d Bug fix to MovMov2Mov 6 optimisation exposed by 4012c3dbd4 (and miscellaneous code refactors) 2021-10-22 22:39:46 +00:00
florian
ea6529ff63 * manually merged merge request 69 by J. Gareth "Kit" Moreton:
x86: CMP/MOV refactoring and expansion
      This merge request refactors the SwapMovCmp routine, and calls to it, to be more self-contained,
      having the preliminary checks built-in to ensure that moving the MOV instruction is
      actually a sound idea, while also making it more general-purpose so it can handle instructions
      that are not MOV operations. This feature is primarily for future expansion,
      but also cleans up the code for the x86 peephole optimizer.
2021-10-17 10:22:30 +02:00
florian
4012c3dbd4 * merge request 75 by J. Gareth "Kit" Moreton manually applied:
This merge request makes a number of improvements to the DeepMOVOpt method and supporting functions:

      * ReplaceRegisterInInstruction now replaces registers in references that are written to
        (since the registers themselves won't change)
      * RegModifiedByInstruction will no longer return True for a register that appears in a reference
        that's written to (for the same reason as above) - special operations like MOVSS
        (the 0-operand version) aren't affected.
      * DeepMOVOpt returning True will now always set the Result of OptPass1MOV to True even though p
        wasn't directly modified, since this often caused missed optimisations.
      * Some of the speed-ups in the patch from #32916 have also been applied in order to make
        the general DeepMOVOpt run faster, notably it tries to avoid calling UpdateUsedRegs where possible.
2021-10-17 09:50:47 +02:00
J. Gareth "Curious Kit" Moreton
fd28cc0db0 Better handling of zeroing upper parts of registers
2021-10-16 14:42:19 +02:00
J. Gareth "Curious Kit" Moreton
674ed4069a Expanded MM block move to include YMM registers under AVX 2021-10-16 14:17:41 +02:00
florian
d55b2c2a35 + extend assembler optimization MovxMov2Mov to MovxOp2Op 2021-10-15 23:12:59 +02:00
florian
07413be8b5 + being able to define change information for xmm0
* corrected change information for SHA256RNDS2
2021-10-10 23:07:23 +02:00
J. Gareth "Curious Kit" Moreton
a925522ead xor optimisation now doesn't check to see if the REX prefix will actually be removed, as it's beneficial for speed reasons to only use the 32-bit register when zeroing the whole thing 2021-10-10 16:17:43 +00:00
florian
2c180cf101 * by default, DEBUG_AOPTCPU is only enabled if the compiler is compiled with -dEXTDEBUG 2021-10-10 15:35:38 +02:00
florian
b4bf371b34 * generate VMOVAPS for (V)Cvtss2CvtSd(V)Cvtsd2ss optimization, resolves #39360 2021-10-08 22:59:29 +02:00
florian
4752230c8f * use source register as second register in VCVTSD2SS and VCVTSS2SD, this should break
dependency chains better and resolves partially #39360
2021-10-07 23:16:39 +02:00
florian
ec40db3da7 + (V)Cvtss2CvtSd(V)Cvtsd2ss2Nop optimization, resolves #39360 2021-10-06 21:57:24 +02:00
florian
1e136b0cc7 * bail out early in MatchInstruction 2021-10-04 22:18:53 +02:00
florian
01a449c807 + debug msg added 2021-10-04 22:11:08 +02:00