J. Gareth "Curious Kit" Moreton
33cf86ff9f
PostPeepholeOptTestOr now removes TEST when dealing with POPCNT and LZCNT
2022-01-06 20:57:48 +00:00
J. Gareth "Curious Kit" Moreton
116c861af6
MOV/CMP optimisation is now in both Pass 1 and Pass 2 to catch more eventualities
2021-12-31 14:28:35 +00:00
J. Gareth "Curious Kit" Moreton
8609c0803e
Fixed MovxOp2Op failing on i386 due to lack of register check
2021-12-26 16:20:18 +00:00
J. Gareth "Curious Kit" Moreton
f289f2694a
x86: Additions to OptPass2Movx to better synergise with new CMP optimisation under -O2
2021-12-25 19:07:48 +00:00
J. Gareth "Curious Kit" Moreton
683a92bcc8
i386: Correction to GetIntRegisterBetween to ensure we only get 8-bit registers that we can actually encode
2021-12-25 19:07:48 +00:00
J. Gareth "Curious Kit" Moreton
1da7ce46de
x86: New double CMP optimisation to remove a branch
2021-12-25 19:07:48 +00:00
J. Gareth "Curious Kit" Moreton
cafd708b6d
Refactoring of OptPass2Movx to remove goto
2021-12-25 16:38:10 +00:00
J. Gareth "Curious Kit" Moreton
22cd8d5d62
Fixed bug in MovxMovx2Movx optimisation that would specify a 64-bit destination instead of 32-bit one
2021-12-25 14:49:08 +00:00
J. Gareth "Curious Kit" Moreton
b4c8c1da12
Overflow bug fixes to MovZX/SX optimisations when CMP instructions are encountered.
2021-12-23 07:14:49 +00:00
florian
6dbe71cd30
* TX86AsmOptimizer.OptPass1MOVXX should search only over other instructions if it works with registers only
2021-12-22 22:54:11 +01:00
florian
6147d6d8a0
* compilation with i386 fixed
2021-12-21 22:46:12 +01:00
J. Gareth "Curious Kit" Moreton
d083cc7247
New MovxAndTest2Test optimisation to mirror the regular MovAndTest2Test optimisation
2021-12-20 22:10:22 +00:00
J. Gareth "Curious Kit" Moreton
5b4c104aaf
Massive overhaul to OptPass2Movx to favour operand shrinkage
2021-12-20 22:10:22 +00:00
J. Gareth "Curious Kit" Moreton
d255ffba8b
Improved handling of signed sequences in OptPass2Movx
2021-12-20 22:10:22 +00:00
J. Gareth "Curious Kit" Moreton
01e5f4855a
MovZX->MovSX optimisation
2021-12-20 22:10:22 +00:00
J. Gareth "Curious Kit" Moreton
4825d2d16c
New Movz ###,%ecx, shift/rotate %cl,... optimisation
2021-12-20 22:10:22 +00:00
J. Gareth "Curious Kit" Moreton
f02b7508de
Bolder OptPass2Movx optimisations, including a simplification fix
2021-12-20 22:10:22 +00:00
J. Gareth "Curious Kit" Moreton
da899df6b2
New MovxMovxOp2OpMovx optimisation
2021-12-20 22:10:22 +00:00
J. Gareth "Curious Kit" Moreton
40196f4a43
Fixes to ADD/SUB 128 optimisation that didn't check flags properly, and also handling ADC/SBB properly
2021-12-19 20:51:57 +00:00
J. Gareth "Curious Kit" Moreton
b4bd15a5c0
Removed incorrect logic in TEST optimisation
2021-12-17 22:10:12 +00:00
J. Gareth "Curious Kit" Moreton
be448e29f6
Fixed bug in new TEST optimisation where a FLAGS check always returned "in use"
2021-12-15 20:14:26 +00:00
J. Gareth "Curious Kit" Moreton
42c429bf45
New optimisation that merges small constants written to the stack
2021-12-15 19:47:50 +00:00
J. Gareth "Curious Kit" Moreton
7a15312b54
Safety checks on TEST removals and better FLAG tracking
2021-12-13 16:11:33 +00:00
J. Gareth "Kit" Moreton
f60523a3b9
x86: New TEST optimisations
2021-12-12 21:40:42 +00:00
Yuriy Sydorov
7b2cd0bcdc
* Prevent a range check error in case of big unsigned values.
2021-12-12 17:55:46 +02:00
J. Gareth "Curious Kit" Moreton
2dc0995067
- Bug fix to new ADD/SUB optimisation where conditions are concerned
- Register allocation fixes for overflow checks
2021-11-17 20:18:57 +00:00
J. Gareth "Curious Kit" Moreton
9f60628e5b
x86: new optimisation to change add/sub 128,(dest) to sub/add -128,(dest) to reduce binary size
2021-11-14 21:38:38 +00:00
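Note on the commit above: the size saving presumably comes from x86 immediate encoding, where ADD and SUB accept a sign-extended 8-bit immediate covering -128..127. The constant 128 falls just outside that range, while -128 fits, so flipping the operation lets the shorter form be used. The following is a minimal Python sketch of that idea only; the helper names `fits_imm8` and `shrink_add_sub` are hypothetical and are not taken from the compiler source.

```python
def fits_imm8(value):
    """Return True if value fits in a sign-extended 8-bit immediate."""
    return -128 <= value <= 127


def shrink_add_sub(op, imm):
    """Flip add<->sub when negating the constant lets it fit in imm8.

    Hypothetical helper for illustration. It ignores flag usage entirely:
    add and sub set the carry flag differently, so a real optimiser must
    check that the flags are not consumed afterwards.
    """
    if op not in ("add", "sub"):
        raise ValueError("expected 'add' or 'sub'")
    if not fits_imm8(imm) and fits_imm8(-imm):
        return ("sub" if op == "add" else "add", -imm)
    return (op, imm)


print(shrink_add_sub("add", 128))
print(shrink_add_sub("sub", 128))
print(shrink_add_sub("add", 127))
```

Only the value 128 qualifies: anything smaller already fits in 8 bits, and anything larger does not fit after negation either. The flag caveat in the docstring matches the follow-up fix recorded in this log for 40196f4a43.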
Pierre Muller
8e7791ac23
Explicitly disable overflow for offset propagation optimization
2021-11-08 22:55:44 +00:00
florian
7fcbd1d7e0
* my last commit hopefully fixed
2021-11-07 14:58:17 +01:00
florian
492d75483d
* fix (V)Cvtss2CvtSd(V)Cvtsd2ss2* optimizations for non-avx code, resolves #39416
2021-11-07 14:46:13 +01:00
florian
44051b4af3
* corrected accidentally made changes in 01a449c8, resolves #39424
2021-11-03 22:41:07 +01:00
J. Gareth "Curious Kit" Moreton
284317d877
Fixed OptPass2Lea not honouring symbols
2021-10-31 15:44:00 +00:00
J. Gareth "Curious Kit" Moreton
42eb06f5c6
Fixed some range check problems
2021-10-31 15:44:00 +00:00
J. Gareth "Curious Kit" Moreton
b58fdc3e58
Improved ADD and SUB optimisations for LEA instructions
2021-10-31 15:44:00 +00:00
florian
10fcae34a9
* improved TX86AsmOptimizer.OptPass1MOVXX
2021-10-24 18:38:23 +02:00
florian
4610980f2e
* TX86AsmOptimizer.OptPass1MOVXX takes care of volatility
2021-10-23 23:40:09 +02:00
J. Gareth "Curious Kit" Moreton
342803532d
Bug fix to MovMov2Mov 6 optimisation exposed by 4012c3dbd4 (and miscellaneous code refactors)
2021-10-22 22:39:46 +00:00
florian
ea6529ff63
* manually merged merge request 69 by J. Gareth "Kit" Moreton:
x86: CMP/MOV refactoring and expansion
This merge request refactors the SwapMovCmp routine, and calls to it, to be more self-contained,
having the preliminary checks built-in to ensure that moving the MOV instruction is
actually a sound idea, while also making it more general-purpose so it can handle instructions
that are not MOV operations. This feature is primarily for future expansion,
but also cleans up the code for the x86 peephole optimizer.
2021-10-17 10:22:30 +02:00
florian
4012c3dbd4
* merge request 75 by J. Gareth "Kit" Moreton manually applied:
This merge request makes a number of improvements to the DeepMOVOpt method and supporting functions:
* ReplaceRegisterInInstruction now replaces registers in references that are written to
(since the registers themselves won't change)
* RegModifiedByInstruction will no longer return True for a register that appears in a reference
that's written to (for the same reason as above) - special operations like MOVSS
(the 0-operand version) aren't affected.
* DeepMOVOpt returning True will now always set the Result of OptPass1MOV to True even though p
wasn't directly modified, since this often caused missed optimisations.
* Some of the speed-ups in the patch from #32916 have also been applied in order to make
the general DeepMOVOpt run faster, notably it tries to avoid calling UpdateUsedRegs where possible.
2021-10-17 09:50:47 +02:00
J. Gareth "Curious Kit" Moreton
fd28cc0db0
Better handling of zeroing upper parts of registers
2021-10-16 14:42:19 +02:00
J. Gareth "Curious Kit" Moreton
674ed4069a
Expanded MM block move to include YMM registers under AVX
2021-10-16 14:17:41 +02:00
florian
d55b2c2a35
+ extend assembler optimization MovxMov2Mov to MovxOp2Op
2021-10-15 23:12:59 +02:00
florian
07413be8b5
+ being able to define change information for xmm0
* corrected change information for SHA256RNDS2
2021-10-10 23:07:23 +02:00
J. Gareth "Curious Kit" Moreton
a925522ead
xor optimisation now doesn't check to see if the REX prefix will actually be removed, as it's beneficial for speed reasons to only use the 32-bit register when zeroing the whole thing
2021-10-10 16:17:43 +00:00
florian
2c180cf101
* by default, DEBUG_AOPTCPU is only enabled if the compiler is compiled with -dEXTDEBUG
2021-10-10 15:35:38 +02:00
florian
b4bf371b34
* generate VMOVAPS for (V)Cvtss2CvtSd(V)Cvtsd2ss optimization, resolves #39360
2021-10-08 22:59:29 +02:00
florian
4752230c8f
* use source register as second register in VCVTSD2SS and VCVTSS2SD, this should break dependency chains better and resolves partially #39360
2021-10-07 23:16:39 +02:00
florian
ec40db3da7
+ (V)Cvtss2CvtSd(V)Cvtsd2ss2Nop optimization, resolves #39360
2021-10-06 21:57:24 +02:00
florian
1e136b0cc7
* bail out early in MatchInstruction
2021-10-04 22:18:53 +02:00
florian
01a449c807
+ debug msg added
2021-10-04 22:11:08 +02:00