Rika Ichinose | 94a1f33f60 | Shorten i386 and x86-64 atomic implementations to offset the LoC cost of the previous commit. | 2024-12-19 19:42:25 +00:00
Rika Ichinose | bb43afd26d | Add more specialized atomics for i386 and x86-64. | 2024-12-19 19:42:25 +00:00
Sven/Sarah Barth | e94d02a067 | * with all existing RTLs switched over to the atomic intrinsics, the define FPC_SYSTEM_INTERLOCKED_USE_INTRIN can be removed again | 2024-12-12 22:05:20 +01:00
Sven/Sarah Barth | ba7e87aff3 | * switch x86_64 RTL to provide the atomic intrinsics instead of Interlocked* functions | 2024-12-12 22:05:16 +01:00
florian | e471c08cf8 | + SHA512Support | 2024-12-07 11:10:34 +01:00
florian | 73e96f8f1e | * simplify SysResetFPU | 2024-12-06 21:21:02 +01:00
florian | ccae78f97a | + RiscV64: apply OptPass1OP also to addiw | 2024-11-13 22:56:13 +01:00
florian | 54dcfa78f8 | * cleanup | 2024-10-26 20:32:14 +02:00
Rika Ichinose | aed4292017 | SSE set operations (i386). | 2024-10-26 15:48:17 +00:00
Alligator-1 | 00d5351b55 | partial revert | 2024-08-26 20:20:57 +00:00
Alligator-1 | 8c3829e698 | nostackframe | 2024-08-26 13:02:45 +00:00
Rika Ichinose | d7352e7b66 | Remove most of the VER3_0 conditionals. | 2024-08-25 09:44:11 +00:00
Rika Ichinose | ca0e04a346 | Faster path for IndexBytes with a match at the beginning. | 2024-08-19 20:15:54 +00:00
Rika Ichinose | 1030f67fb4 | Use IndexQWord_SSE41 directly if -Cp RTL compiled with supports SSE 4.1. | 2024-07-21 08:40:12 +00:00
Rika Ichinose | 8bf2dc3f2b | Simplify CPU units (70 LoC + 500 b code + 500 b data). | 2024-07-18 20:13:11 +00:00
Rika Ichinose | a575a5c0fd | Move Int128Rec to System; remove i386 and x86_64 CPU unit dependency on SysUtils. | 2024-07-15 13:31:20 +00:00
Rika Ichinose | 0ca608243c | SSE4.1 IndexQWord for i386 and x86-64. | 2024-06-29 20:37:55 +00:00
florian | 567187d4ba | + TSCSupport | 2024-06-29 22:32:36 +02:00
florian | a0cae50af6 | * rtl part of #35433 | 2024-05-01 23:15:12 +02:00
Rika Ichinose | b87e22151a | Use non-conservative Fill thresholds. | 2024-04-22 19:37:36 +00:00
florian | 11f076f0e7 | + CMPXCHG16BSupport | 2024-02-28 22:18:42 +01:00
Rika Ichinose | 2d6294eb26 | MovQ + Shr → PExtrW. | 2024-02-18 21:37:39 +00:00
Rika Ichinose | c29dd86bb2 | Remove runtime ABI adapter in x86_64.inc:IndexByte/Word, and save two jumps in the common case. | 2024-02-11 15:05:03 +00:00
Rika Ichinose | 7bf502ad40 | Change Mov*DQ to Mov*PS; they are always equivalent because no operations but the memory transfers are performed, and 1 byte shorter each. | 2024-02-10 22:47:40 +00:00
Rika Ichinose | 12f18177ae | Simplify x86_64.inc:Move non-temporal loops, and adjust thresholds for move distances considered too short for NT. | 2024-02-10 22:47:40 +00:00
Rika Ichinose | 0b5998ee8b | Write two last values after 2× loops unconditionally instead of an extra check. | 2024-02-10 22:47:40 +00:00
Rika Ichinose | e395166cb7 | Check for Move overlaps in more obvious way (that also does no jumps in forward case). | 2024-02-10 22:47:40 +00:00
Rika Ichinose | 0d5f7fa66b | Increase non-temporal i386 & x64 Fill* thresholds to 4 Mb. | 2024-01-01 18:33:33 +00:00
Rika Ichinose | 1ec0326995 | REP STOS branch for x64 Fill* (only for System V ABI for now). | 2023-11-26 15:06:59 +00:00
Rika Ichinose | a4c324ee23 | Fill* for x64, physically sharing half of the code with FillChar. | 2023-11-26 15:06:59 +00:00
Rika Ichinose | b468793c63 | Index/Compare refined by hand instead of mostly being GCC output. | 2023-11-21 22:32:16 +00:00
florian | b164817e18 | * check also for XGETBV support, resolves problem reported by Pierre | 2023-11-20 22:55:25 +01:00
florian | 704ad21b23 | + centralized cpu capability detection | 2023-11-18 22:28:50 +01:00
Rika Ichinose | c07f36b30b | Post-modern CompareByte for x86-64/SSE2. | 2023-11-16 21:42:51 +00:00
Rika Ichinose | 0bc1d8d446 | Deny effective RTM support if CPUID bit RTM_ALWAYS_ABORT is set. | 2023-11-01 17:10:14 +00:00
Rika Ichinose | e00ab51185 | On i386 and x86_64, add cpu.CPUID — high-level wrapper to CPUID instruction, and cpu.CPUBrandString — convenience for CPUID leaves 80000002, 80000003, and 80000004. | 2023-10-31 21:20:45 +03:00
Rika Ichinose | 0e426db5de | x86_64.inc: shorten Interlocked*, perform macro-fused test+jz in Index* early. | 2023-10-25 21:05:21 +00:00
Rika Ichinose | 2dca69f2ac | Specialized fpc_varset_OP_sets for i386 and x86-64. | 2023-08-30 19:38:33 +00:00
Michael VAN CANNEYT | ccfa38c68e | * Dotted RTL compiles | 2023-07-27 19:04:03 +02:00
Michael VAN CANNEYT | 5ce739135b | * Char -> AnsiChar | 2023-07-14 17:26:10 +02:00
Rika Ichinose | 669d41172c | Fix UTF-8 symbols in comments. | 2023-07-08 21:18:55 +00:00
Rika Ichinose | 8d5d7b480d | Supposedly faster Move for x64. | 2023-07-08 21:18:55 +00:00
Rika Ichinose | f20c7b9ae9 | Shorter x86_64.inc:inc/declocked. | 2023-06-14 21:19:11 +00:00
Rika Ichinose | b56cbad50e | Supposedly faster FillChar for x64. | 2023-04-13 15:55:42 +00:00
Rika Ichinose | 8e884d9acd | Handle Index* / Compare* tail by directly reading last VECSIZE bytes, if there was at least one full vector. | 2023-04-03 20:08:56 +00:00
florian | ee16fc7b96 | * patch by Rika, trivial adjustments to !373, resolves #40172 | 2023-02-27 22:07:06 +01:00
Rika Ichinose | da12cfc867 | Improved CompareWord for i386 and x86_64. | 2023-02-25 22:52:38 +00:00
florian | 7cc94fc000 | * patch by Rika: Trivial adjustments to !379, resolves #40168 | 2023-02-23 22:46:05 +01:00
Rika Ichinose | b723178117 | Even better CompareByte for x64. Tries to handle tails with a SIMD unit as well. | 2023-02-19 18:12:37 +00:00
Rika Ichinose | d36e96ea74 | Improved CompareDWord for i386 and x86_64. | 2023-02-19 18:07:46 +00:00