830 Commits

Author SHA1 Message Date
Rika Ichinose
c433b740e4 Add/actualize CPUX86_HAS_SSSE3. 2025-03-05 22:35:01 +01:00
J. Gareth "Curious Kit" Moreton
5536810075 * x86: Fixed bug where "aoc_ForceNewIteration" wouldn't update the registers properly in some circumstances 2025-03-02 14:00:57 +00:00
florian
9355e703d7 * change some getglobaldatalabel into getlocaldatalabel to simplify code if pic is used 2025-01-13 22:34:31 +01:00
florian
60690e379e * typo fixed 2024-12-08 11:14:37 +01:00
florian
5b54ab2040 + zen5 architecture for completeness 2024-12-07 22:23:35 +01:00
User Muller
f689746372 cmpxchg16b instruction uses no size suffix in ATT syntax 2024-12-06 14:45:44 +01:00
J. Gareth "Curious Kit" Moreton
8520dabebb * x86: New RET/lbl/RET optimisation 2024-10-27 08:17:10 +00:00
Rika Ichinose
1030f67fb4 Use IndexQWord_SSE41 directly if -Cp RTL compiled with supports SSE 4.1. 2024-07-21 08:40:12 +00:00
florian
db5e821ead * more change information updates 2024-07-02 22:21:24 +02:00
florian
9ce7fbeef0 * change information updates 2024-06-28 22:45:00 +02:00
florian
2fe3955be9 + more change information 2024-06-11 23:22:21 +02:00
florian
bdb611c925 * small fix of change information 2024-06-10 23:10:48 +02:00
florian
c64fae2f89 * missing AVX-2 change information fixed 2024-06-03 23:29:37 +02:00
florian
73a251410e + more change information 2024-06-02 23:11:14 +02:00
florian
860c32f833 + set CPUX86_HINT_BSX_DEST_UNCHANGED_ON_ZF_1 for suitable CPUs 2024-06-01 19:41:46 +02:00
J. Gareth "Curious Kit" Moreton
2de19f9e66 * x86: Reimplemented TAsmNode XML dumping using new framework 2024-05-30 20:04:11 +00:00
florian
4eb8f8e565 * patch by Marģers
    - Rename 3DNow instruction (fixed long lasting typo in mnemonic). PMULHRWA  --> PMULHRW
    - Add vpclmullqlqdq, vpclmulhqlqdq, vpclmullqhqdq, vpclmulhqhqdq.
    - Fix "typo" for SHA1MSG2
2024-05-30 21:49:30 +02:00
florian
53459fed2b + CPUX86_HINT_BSX_DEST_UNCHANGED_ON_ZF_1 2024-05-24 23:17:07 +02:00
florian
b826ad8b7e + CPUX86_HINT_FAST_SHORT_REP_MOVS
* use FPC_MOVE instead of rep movs if possible, partially fixes 
2024-05-16 22:59:21 +02:00
florian
1de3aba4e3 * few types fixed 2024-04-26 22:54:27 +02:00
J. Gareth "Curious Kit" Moreton
11b341cc97 * x86: Added new OptPass1CMOVcc peephole optimisation routine to dust up min/max code 2024-03-26 14:18:31 +00:00
J. Gareth "Curious Kit" Moreton
a7fe49f38f * x86: CMOVcc/Jcc pairs are now changed to MOV/Jcc if the register is not used if the jump doesn't branch 2024-03-10 21:09:59 +00:00
florian
cad21584e5 + Skylake-X 2024-02-25 22:52:30 +01:00
florian
587af1c78e * icelake is x86-64-v4 2024-02-24 21:54:11 +01:00
florian
37ed03667f * fixed fpu_x86_64_v4_flags 2024-02-23 21:51:37 +01:00
J. Gareth "Curious Kit" Moreton
629c87efc8 * x86-64: Typo fixed in FPU type string array 2024-02-13 14:39:29 +00:00
J. Gareth "Curious Kit" Moreton
62495c964a * x86: New "aoc_DoPass2JccOpts" option to catch branch and STC/CLC optimisations that only manifest in Pass 2 2024-02-11 15:05:57 +00:00
J. Gareth "Curious Kit" Moreton
3e06242fd8 * x86: New "STC/CLC; MOV" peephole optimisation 2024-02-11 15:05:57 +00:00
J. Gareth "Curious Kit" Moreton
b4eabbe5ce * x86: Fixed CPU feature flags for AMD Jaguar and Piledriver 2024-02-06 22:15:51 +00:00
J. Gareth "Curious Kit" Moreton
e4bd58d66a * x86: Replaced CPU features array with "cpu_x86_64_v1_flags" where possible 2024-02-06 22:15:51 +00:00
florian
f80f1112d4 + Zen 4 2024-02-05 23:18:07 +01:00
florian
ac6dc582be + also add x86-64 as cpu type (gcc compatibility) 2024-02-03 22:40:54 +01:00
florian
f8dbb09a46 * fixed some issues with the x86-64 instruction versions
* use more of the constants
2024-02-01 22:08:27 +01:00
florian
ae465fa8dc + introduce x86-64 microarchitecture levels for cpu and fpu flags 2024-01-31 22:32:57 +01:00
florian
ff3b4adc27 + more CPU and FPU flags added 2024-01-30 22:55:42 +01:00
J. Gareth "Curious Kit" Moreton
b514e979bd * Fixed issue where OptPass2CMP and OptPass2TEST didn't drop out on labels etc. 2024-01-27 19:00:50 +00:00
florian
c4fc5fc916 * disable OptPass2Test and OptPass2CMP for now as it seems to result in buggy code 2024-01-23 22:11:59 +01:00
J. Gareth "Curious Kit" Moreton
63879e74cd * x86: Additional TEST/CMP optimisations to optimise CMOV blocks that aren't optimal due to register pressure 2024-01-07 14:29:42 +00:00
Jonas Maebe
0ca260e08c LLVM: fix currency parameters passed on the stack on x86-64
Resolves 
2023-11-05 11:30:19 +01:00
J. Gareth "Curious Kit" Moreton
47825610b8 * Pass 2 can now be run multiple times when under -O3 and above. 2023-11-05 10:03:52 +00:00
J. Gareth "Curious Kit" Moreton
ede47ffea9 * New "fast 3-component LEA hint" and "Icelake" CPU options 2023-10-29 10:26:52 +00:00
Sven Barth
82dd70e72f * fix parameter alignment on x86_64 when more than 6 parameters are involved (aka the stack is used)
+ added test
2023-08-03 22:34:28 +02:00
J. Gareth "Curious Kit" Moreton
b8933dd267 * x86: Some refactoring to use aoc_ForceNewIteration instead of manually advancing p 2023-03-04 18:40:27 +00:00
florian
9c10167b6f + CPUX86_HAS_BSWAP 2023-02-04 19:20:10 +01:00
florian
14466ee9d9 * change table updates 2022-12-06 22:41:30 +01:00
florian
8ad7decaa3 * another change information update 2022-12-04 23:17:56 +01:00
florian
42d91c02bd * continued to fix change information 2022-12-03 23:36:07 +01:00
florian
e0eff8bd89 + more change information fixed 2022-12-02 23:34:36 +01:00
J. Gareth "Curious Kit" Moreton
170c112301 * x86: Added FMA as an FPU target distinct from AVX and AVX2 (the latter of which has a new FPUX86_HAS_AVX2 flag) 2022-11-25 22:14:59 +00:00
J. Gareth "Curious Kit" Moreton
69c7838571 * x86: Addition of AMD CPUs: Bobcat, Jaguar, Piledriver, Excavator, Zen2 and Zen3 (and supporting flags) 2022-11-25 22:14:59 +00:00