Rika Ichinose
ce6db34224
Shortcut Compare*(a, a) before entering the aligned loop.
2025-03-29 22:07:03 +01:00
Rika Ichinose
ff2492edf5
Add System.UMul64x64_128.
2025-03-15 22:18:55 +01:00
Rika Ichinose
4f92679625
BMI1 → BMI2.
2025-03-13 01:02:15 +03:00
Rika Ichinose
900b1fc4ec
Check for refcount = 1 first.
2025-02-16 15:17:48 +03:00
Rika Ichinose
6ccad3dc4e
Shortcut declocked on refcount = 1.
2025-01-31 22:03:25 +00:00
Rika Ichinose
94a1f33f60
Shorten i386 and x86-64 atomic implementations to offset the LoC cost of the previous commit.
2024-12-19 19:42:25 +00:00
Rika Ichinose
bb43afd26d
Add more specialized atomics for i386 and x86-64.
2024-12-19 19:42:25 +00:00
Sven/Sarah Barth
e94d02a067
* with all existing RTLs switched over to the atomic intrinsics, the define FPC_SYSTEM_INTERLOCKED_USE_INTRIN can be removed again
2024-12-12 22:05:20 +01:00
Sven/Sarah Barth
295d3f0969
* switch i386 RTL to provide the atomic intrinsics instead of Interlocked* functions
2024-12-12 22:05:16 +01:00
florian
e471c08cf8
+ SHA512Support
2024-12-07 11:10:34 +01:00
Rika Ichinose
d1db5d2104
Darwin: re-enable new assembler fill*word variants
Work around with an extra jump to an extra function.
2024-11-23 19:06:47 +03:00
Jonas Maebe
28e9ebc7da
Darwin: disable new assembler fill*word variants
They use interprocedural gotos at the assembler level, which is incompatible
with auto-generated CFI
2024-11-20 21:39:19 +01:00
Rika Ichinose
6e655eb5a3
Remove fpc_varset_* indirections if SSE support is guaranteed.
2024-10-27 08:16:25 +00:00
florian
54dcfa78f8
* cleanup
2024-10-26 20:32:14 +02:00
Rika Ichinose
aed4292017
SSE set operations (i386).
2024-10-26 15:48:17 +00:00
Rika Ichinose
9917350ef0
AVX2 CompareByte for i386.
2024-09-23 20:10:57 +00:00
Rika Ichinose
fc1050a834
Make use of CPUX86_HINT_BSX_DEST_UNCHANGED_ON_ZF_1 in Bsf*/Bsr*.
2024-09-22 08:33:44 +00:00
Rika Ichinose
d7352e7b66
Remove most of the VER3_0 conditionals.
2024-08-25 09:44:11 +00:00
Rika Ichinose
ea33fdcdf8
Decimate rtl/i386/strings.inc.
2024-08-19 20:34:10 +00:00
Rika Ichinose
ca0e04a346
Faster path for IndexBytes with a match at the beginning.
2024-08-19 20:15:54 +00:00
Rika Ichinose
1030f67fb4
Use IndexQWord_SSE41 directly if the -Cp the RTL is compiled with supports SSE 4.1.
2024-07-21 08:40:12 +00:00
Rika Ichinose
8bf2dc3f2b
Simplify CPU units (70 LoC + 500 b code + 500 b data).
2024-07-18 20:13:11 +00:00
Rika Ichinose
a575a5c0fd
Move Int128Rec to System; remove i386 and x86_64 CPU unit dependency on SysUtils.
2024-07-15 13:31:20 +00:00
Rika Ichinose
73bf0c82bb
Disable _Plain versions when compiling RTL for newer CPUs.
2024-07-14 14:36:17 +00:00
florian
9d957cd6b3
* fix TSC support bit as mentioned by Rika
2024-07-01 22:26:03 +02:00
florian
bdef7af09e
* corrected rte number after last merge
2024-06-30 15:25:08 +02:00
Rika Ichinose
ea271c1088
Make int64 division helpers “nostackframe”.
2024-06-30 12:51:40 +00:00
Rika Ichinose
0ca608243c
SSE4.1 IndexQWord for i386 and x86-64.
2024-06-29 20:37:55 +00:00
florian
567187d4ba
+ TSCSupport
2024-06-29 22:32:36 +02:00
florian
a0cae50af6
* rtl part of #35433
2024-05-01 23:15:12 +02:00
Rika Ichinose
0655b342d4
Shorter IndexByte_Plain.
2024-04-26 18:35:47 +00:00
Rika Ichinose
b87e22151a
Use non-conservative Fill thresholds.
2024-04-22 19:37:36 +00:00
Rika Ichinose
bad42011ab
Better i386.inc:fpc_ansistr_unique.
2024-04-08 18:47:18 +00:00
Rika Ichinose
e42209457e
Shorter i386 Exp().
2024-03-29 21:04:32 +00:00
florian
22e9033076
+ MMXSupport added to cpu unit
* mmx unit makes more use of cpu unit
2024-03-28 10:42:08 +01:00
Rika Ichinose
6c8acf28cd
Shorten MMX unit.
2024-03-28 09:13:43 +00:00
Rika Ichinose
a35577593b
Don’t misalign FillChar pattern.
2024-03-10 21:24:21 +00:00
Rika Ichinose
e87e14c7cc
Make some i386.inc functions “nostackframe”.
2024-02-21 21:08:36 +00:00
Rika Ichinose
e5b47310c8
Supposedly faster i386 int() and frac().
2024-02-18 21:37:39 +00:00
Rika Ichinose
0b5998ee8b
Write two last values after 2× loops unconditionally instead of an extra check.
2024-02-10 22:47:40 +00:00
Rika Ichinose
e395166cb7
Check for Move overlaps in a more obvious way (that also does no jumps in the forward case).
2024-02-10 22:47:40 +00:00
Rika Ichinose
e4a0b1adb4
Use ERMS in all eligible cases, again.
Namely, when Move.count > NtThreshold but move distance is too short. 8310b169b7
messed with the logic and made this case fall back to a regular loop instead of more preferable ERMS.
2024-01-11 21:51:44 +00:00
Rika Ichinose
35345fe145
Fix FillQWord_SSE2 stack usage.
2024-01-06 21:18:56 +00:00
Rika Ichinose
9d8b801e4c
Improve i386 fpc_shortstr_to_shortstr(), fpc_shortstr_compare(), and add fpc_shortstr_compare_equal().
2024-01-01 21:12:52 +00:00
Rika Ichinose
0d5f7fa66b
Increase non-temporal i386 & x64 Fill* thresholds to 4 Mb.
2024-01-01 18:33:33 +00:00
Rika Ichinose
b7d32e4933
ERMSB-aware Fill* for i386.
2024-01-01 18:33:33 +00:00
Rika Ichinose
8310b169b7
Move ERMS branch into a separate function instead of runtime checks of fast_large_repmovstosb.
2023-12-31 09:54:09 +00:00
Rika Ichinose
f14aced9c5
Attempt to ERMS backward i386 ‘Move’s.
2023-12-31 09:54:09 +00:00
Rika Ichinose
ecc56d7e68
Attempt to save push/pop ebx on small non-GPR moves.
2023-12-10 13:26:39 +00:00
Rika Ichinose
0750777fc8
Supposedly better fastmove.inc.
2023-12-10 13:26:39 +00:00