Commit Graph

253 Commits

Author SHA1 Message Date
Rika Ichinose
94a1f33f60 Shorten i386 and x86-64 atomic implementations to offset the LoC cost of the previous commit. 2024-12-19 19:42:25 +00:00
Rika Ichinose
bb43afd26d Add more specialized atomics for i386 and x86-64. 2024-12-19 19:42:25 +00:00
Sven/Sarah Barth
e94d02a067 * with all existing RTLs switched over to the atomic intrinsics, the define FPC_SYSTEM_INTERLOCKED_USE_INTRIN can be removed again 2024-12-12 22:05:20 +01:00
Sven/Sarah Barth
ba7e87aff3 * switch x86_64 RTL to provide the atomic intrinsics instead of Interlocked* functions 2024-12-12 22:05:16 +01:00
florian
e471c08cf8 + SHA512Support 2024-12-07 11:10:34 +01:00
florian
73e96f8f1e * simplify SysResetFPU 2024-12-06 21:21:02 +01:00
florian
ccae78f97a + RiscV64: apply OptPass1OP also to addiw 2024-11-13 22:56:13 +01:00
florian
54dcfa78f8 * cleanup 2024-10-26 20:32:14 +02:00
Rika Ichinose
aed4292017 SSE set operations (i386). 2024-10-26 15:48:17 +00:00
Alligator-1
00d5351b55 partial revert 2024-08-26 20:20:57 +00:00
Alligator-1
8c3829e698 nostackframe 2024-08-26 13:02:45 +00:00
Rika Ichinose
d7352e7b66 Remove most of the VER3_0 conditionals. 2024-08-25 09:44:11 +00:00
Rika Ichinose
ca0e04a346 Faster path for IndexBytes with a match at the beginning. 2024-08-19 20:15:54 +00:00
Rika Ichinose
1030f67fb4 Use IndexQWord_SSE41 directly if -Cp RTL compiled with supports SSE 4.1. 2024-07-21 08:40:12 +00:00
Rika Ichinose
8bf2dc3f2b Simplify CPU units (70 LoC + 500 b code + 500 b data). 2024-07-18 20:13:11 +00:00
Rika Ichinose
a575a5c0fd Move Int128Rec to System; remove i386 and x86_64 CPU unit dependency on SysUtils. 2024-07-15 13:31:20 +00:00
Rika Ichinose
0ca608243c SSE4.1 IndexQWord for i386 and x86-64. 2024-06-29 20:37:55 +00:00
florian
567187d4ba + TSCSupport 2024-06-29 22:32:36 +02:00
florian
a0cae50af6 * rtl part of #35433 2024-05-01 23:15:12 +02:00
Rika Ichinose
b87e22151a Use non-conservative Fill thresholds. 2024-04-22 19:37:36 +00:00
florian
11f076f0e7 + CMPXCHG16BSupport 2024-02-28 22:18:42 +01:00
Rika Ichinose
2d6294eb26 MovQ + Shr → PExtrW. 2024-02-18 21:37:39 +00:00
Rika Ichinose
c29dd86bb2 Remove runtime ABI adapter in x86_64.inc:IndexByte/Word, and save two jumps in the common case. 2024-02-11 15:05:03 +00:00
Rika Ichinose
7bf502ad40 Change Mov*DQ to Mov*PS; they are always equivalent because no operations but the memory transfers are performed, and 1 byte shorter each. 2024-02-10 22:47:40 +00:00
Rika Ichinose
12f18177ae Simplify x86_64.inc:Move non-temporal loops, and adjust thresholds for move distances considered too short for NT. 2024-02-10 22:47:40 +00:00
Rika Ichinose
0b5998ee8b Write two last values after 2× loops unconditionally instead of an extra check. 2024-02-10 22:47:40 +00:00
Rika Ichinose
e395166cb7 Check for Move overlaps in more obvious way (that also does no jumps in forward case). 2024-02-10 22:47:40 +00:00
Rika Ichinose
0d5f7fa66b Increase non-temporal i386 & x64 Fill* thresholds to 4 Mb. 2024-01-01 18:33:33 +00:00
Rika Ichinose
1ec0326995 REP STOS branch for x64 Fill* (only for System V ABI for now). 2023-11-26 15:06:59 +00:00
Rika Ichinose
a4c324ee23 Fill* for x64, physically sharing half of the code with FillChar. 2023-11-26 15:06:59 +00:00
Rika Ichinose
b468793c63 Index/Compare refined by hand instead of mostly being GCC output. 2023-11-21 22:32:16 +00:00
florian
b164817e18 * check also for XGETBV support, resolves problem reported by Pierre 2023-11-20 22:55:25 +01:00
florian
704ad21b23 + centralized cpu capability detection 2023-11-18 22:28:50 +01:00
Rika Ichinose
c07f36b30b Post-modern CompareByte for x86-64/SSE2. 2023-11-16 21:42:51 +00:00
Rika Ichinose
0bc1d8d446 Deny effective RTM support if CPUID bit RTM_ALWAYS_ABORT is set. 2023-11-01 17:10:14 +00:00
Rika Ichinose
e00ab51185 On i386 and x86_64, add cpu.CPUID — high-level wrapper to CPUID instruction, and cpu.CPUBrandString — convenience for CPUID leaves 80000002, 80000003, and 80000004. 2023-10-31 21:20:45 +03:00
Rika Ichinose
0e426db5de x86_64.inc: shorten Interlocked*, perform macro-fused test+jz in Index* early. 2023-10-25 21:05:21 +00:00
Rika Ichinose
2dca69f2ac Specialized fpc_varset_OP_sets for i386 and x86-64. 2023-08-30 19:38:33 +00:00
Michael VAN CANNEYT
ccfa38c68e * Dotted RTL compiles 2023-07-27 19:04:03 +02:00
Michael VAN CANNEYT
5ce739135b * Char -> AnsiChar 2023-07-14 17:26:10 +02:00
Rika Ichinose
669d41172c Fix UTF-8 symbols in comments. 2023-07-08 21:18:55 +00:00
Rika Ichinose
8d5d7b480d Supposedly faster Move for x64. 2023-07-08 21:18:55 +00:00
Rika Ichinose
f20c7b9ae9 Shorter x86_64.inc:inc/declocked. 2023-06-14 21:19:11 +00:00
Rika Ichinose
b56cbad50e Supposedly faster FillChar for x64. 2023-04-13 15:55:42 +00:00
Rika Ichinose
8e884d9acd Handle Index* / Compare* tail by directly reading last VECSIZE bytes, if there was at least one full vector. 2023-04-03 20:08:56 +00:00
florian
ee16fc7b96 * patch by Rika, trivial adjustments to !373, resolves #40172 2023-02-27 22:07:06 +01:00
Rika Ichinose
da12cfc867 Improved CompareWord for i386 and x86_64. 2023-02-25 22:52:38 +00:00
florian
7cc94fc000 * patch by Rika: Trivial adjustments to !379, resolves #40168 2023-02-23 22:46:05 +01:00
Rika Ichinose
b723178117 Even better CompareByte for x64. Tries to handle tails with a SIMD unit as well. 2023-02-19 18:12:37 +00:00
Rika Ichinose
d36e96ea74 Improved CompareDWord for i386 and x86_64. 2023-02-19 18:07:46 +00:00