Commit Graph

193 Commits

Author SHA1 Message Date
florian
df0201799e o patch by Nico Erfurth: Support Assembly optimized functions of SwapEndian on ARM
Currently the ARM port uses generic functions for SwapEndian, which are
relatively slow.

This patch adds optimized functions for the 32-bit and 64-bit cases. The
16-bit case is still handled by a normal function: although the generated
code is far from optimal, inlining (which is not possible with asm
functions) makes it faster than the optimized function.

Some numbers from my 1.2GHz Kirkwood (ARMv5):

                        Old     New     New/Old
SwapEndian(Integer)     12.168s 5.411s  44.47%
SwapEndian(Int64)       168.28s 9.015s   5.36%

The test code was:
begin
        I := $FFFFFFF;
        while I > 0 do
        begin
                Val2 := MySwapEndian(Val);
                Dec(I);
        end;
end.

Currently only the ARM implementation is tested. ARMv6+ includes a rev
instruction; I have implemented variants that use it, but was not able to
test them.

git-svn-id: trunk@20685 -
2012-04-01 17:31:49 +00:00
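For readers unfamiliar with what SwapEndian computes, here is a minimal portable sketch in C. It is not the assembler from the patch; the function names `swap_endian16/32/64` are made up for illustration, and the code only shows the byte reordering that the optimized routines (or a single `rev` on ARMv6+) have to produce.

```c
#include <stdint.h>

/* Hypothetical helper names; this is an illustration, not the RTL code. */

/* 16-bit swap: exchange the two bytes. */
static inline uint16_t swap_endian16(uint16_t v)
{
    return (uint16_t)((v >> 8) | (v << 8));
}

/* 32-bit swap: byte 0 <-> byte 3, byte 1 <-> byte 2. */
static inline uint32_t swap_endian32(uint32_t v)
{
    return (v >> 24) |
           ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) |
           (v << 24);
}

/* 64-bit swap: swap each 32-bit half, then exchange the halves. */
static inline uint64_t swap_endian64(uint64_t v)
{
    uint32_t lo = (uint32_t)v;
    uint32_t hi = (uint32_t)(v >> 32);
    return ((uint64_t)swap_endian32(lo) << 32) | swap_endian32(hi);
}
```

Small functions like these are easy for a compiler to inline, which is the effect the commit message describes for the 16-bit case.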
Jonas Maebe
bba4b02eb2 * use r7 instead of r11 as frame pointer on Darwin/iOS, and make sure r7
always points to the previous r7 on the stack (with the saved return
    address coming right after it) so that the debugger and crashreporter
    can use it for backtraces as specified in the ABI
   o changed NR_FRAME_POINTER_REG and RS_FRAME_POINTER_REG from a symbolic
     into a typed constant, and added a new method to tprocinfo that can
     be used to initialize it (so it can be inited to r7/r11 depending on
     the target platform)
  * allow using r9 on Darwin, it was only used by the system on iOS up to
    2.x, which we no longer support
  * prefer using r9 and r12 before r4..r11 on Darwin, because they are
    volatile and hence do not have to be saved

git-svn-id: trunk@20661 -
2012-03-29 20:54:33 +00:00
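The frame layout described in the commit above (the frame pointer points at the saved previous frame pointer, with the saved return address right after it) is what makes backtraces possible. The following C sketch walks such a chain. The `frame_record` struct and `walk_frames` function are hypothetical names for illustration, and the records are ordinary structs rather than a real stack.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of one frame record: saved previous frame pointer,
   immediately followed by the saved return address. */
struct frame_record {
    const struct frame_record *prev_fp;
    uintptr_t return_addr;
};

/* Follow the chain of frame records starting at fp and collect up to
   max return addresses into out. Returns the number collected. */
static size_t walk_frames(const struct frame_record *fp,
                          uintptr_t *out, size_t max)
{
    size_t n = 0;
    while (fp != NULL && n < max) {
        out[n++] = fp->return_addr;
        fp = fp->prev_fp;   /* step to the caller's frame */
    }
    return n;
}
```

A real unwinder would also validate each pointer before dereferencing it; that is omitted here.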
Jonas Maebe
6ba8dc7146 + support for the ARM hard float EABI on Linux (patch by Peter Green):
o new eabihf (hard float) abi
   o vfpv3_d16 variant of VFP (default variant used by EABI assemblers: VFPv3
     with only 16 double registers instead of 32) and pass it to GNU as
   o make the odd numbered single precision floating point VFP registers
     available for explicit allocation for use by the calling convention
  * fixed copy/paste error in stdname of S30 register
  -> use -dFPC_ARMHF to create an ARM eabi hard float compiler
  (mantis #21554)

git-svn-id: trunk@20660 -
2012-03-29 20:50:09 +00:00
florian
e9c5458dd2 o patch by Nico Erfurth:
* Fix for InterLockedCompareExchange on ARMEL

InterLockedCompareExchange would not return the current data on failure.
Getting this to work correctly is a bit tricky. As kuser_cmpxchg does
not return the set value, we have to load it.
There is a tiny chance that we get rescheduled between calling
kuser_cmpxchg and loading the value. If the value changed in between,
there is the possibility that we would return the Comperand without
having done an actual swap, which might cause havoc and destruction.

So, if the exchange failed, compare the value and loop again in case
of CurrentValue == Comperand.

* Improve testing of InterLockedCompareExchange

Added a test to check for the case when Comperand is different from the
current value.

git-svn-id: trunk@20514 -
2012-03-11 21:08:57 +00:00
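The retry logic described in the commit above can be sketched in C as follows. This is an illustration under assumptions, not the RTL assembler: `cmpxchg_status` is a hypothetical stand-in built on C11 atomics that, like the helper described in the message, only reports success or failure and never the value it saw.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a status-only compare-and-exchange helper:
   returns 0 if *ptr equalled oldval and was replaced by newval,
   nonzero otherwise. It does not report the value it observed. */
static int cmpxchg_status(int32_t oldval, int32_t newval,
                          _Atomic int32_t *ptr)
{
    int32_t expected = oldval;
    return atomic_compare_exchange_strong(ptr, &expected, newval) ? 0 : 1;
}

/* Returns the value that was in *target: comperand if the swap happened,
   otherwise the differing current value. */
static int32_t interlocked_compare_exchange(_Atomic int32_t *target,
                                            int32_t newvalue,
                                            int32_t comperand)
{
    for (;;) {
        if (cmpxchg_status(comperand, newvalue, target) == 0)
            return comperand;          /* swap succeeded */

        int32_t current = atomic_load(target);
        if (current != comperand)
            return current;            /* genuine failure: report it */

        /* The value changed back to comperand between the failed
           exchange and the load. Returning it would claim a swap that
           never happened, so try again. */
    }
}
```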
florian
891d7b9349 * committed wrong patch in r20491, fixed with this revision
git-svn-id: trunk@20510 -
2012-03-11 07:38:21 +00:00
florian
18866623cd o patch by Nico Erfurth: Optimize some ARM-RTL functions
Use "nostackframe" for:
  - Sptr (broken without nostackframe)
  - get_caller_addr
  - get_caller_frame

Use cmp+ldrne instead of movs+beq+ldr; it's a bit more pipeline-friendly
and takes some burden off the BPU.

git-svn-id: trunk@20506 -
2012-03-10 21:52:06 +00:00
florian
5b03826549 o patch by Nico Erfurth: Better Locked* implementation for arm on linux
The following functions were changed to make use of the kernel helper
kuser_cmpxchg:
InterLockedDecrement
InterLockedIncrement
InterLockedExchangeAdd
InterLockedCompareExchange

The previous implementation using a spinlock had a couple of drawbacks:
1.) The functions could not be used safely on values not completely managed
by the process itself, because the spinlock did not protect the data but the
functions. For example, think about two processes using shared memory.
They would not be able to share fpc_system_lock, making it unsafe to use
these functions.
2.) With many active threads, there was a high chance that the scheduler
would interrupt a thread while fpc_system_lock was taken, which would
cause any following thread using one of these functions to spin until
the end of its timeslice. This could result in unwanted and unnecessary
latencies.
3.) Every function contained a pointer to fpc_system_lock, resulting in
two polluted DCache lines per call and possible latencies through dcache
misses.

The new implementation only works on Linux Kernel >= 2.6.16
The functions are implemented in a way which tries to minimize cache pollution
and load latencies.

Even without multithreading the new functions are a lot faster. I did
comparisons on my Kirkwood 1.2GHz with the following template code:

var X: longint;
begin
	X := 0;
	while X < longint(100*1000000) do
		FUNCTION(X);
	Writeln(X);
end.

Function                     New        Old
InterLockedIncrement:        0m3.696s   0m23.220s
InterLockedExchangeAdd:      0m4.034s   0m23.242s
InterLockedCompareExchange:  0m4.703s   0m24.006s

This speedup is most probably due to the reduced number of memory accesses;
the old implementation's accesses resulted in lots of cache misses.

git-svn-id: trunk@20491 -
2012-03-10 11:33:20 +00:00
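As a rough illustration of the lock-free approach described in the commit above, the C sketch below builds add, increment, and decrement on top of a compare-and-exchange loop, so the atomicity is tied to the data itself rather than to a process-local lock. It uses C11 atomics instead of the kernel helper, and the function names and return conventions (exchange-add returns the old value, increment and decrement return the new one) are assumptions made for this example.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically add delta to *target and return the previous value.
   Arithmetic is done on uint32_t so that wrap-around is well defined. */
static int32_t interlocked_exchange_add(_Atomic int32_t *target,
                                        int32_t delta)
{
    int32_t old = atomic_load(target);
    /* On failure, atomic_compare_exchange_weak stores the value it
       found into old, so the next attempt uses fresh data. */
    while (!atomic_compare_exchange_weak(
               target, &old,
               (int32_t)((uint32_t)old + (uint32_t)delta)))
        ;
    return old;
}

/* Atomically increment *target and return the new value. */
static int32_t interlocked_increment(_Atomic int32_t *target)
{
    return (int32_t)((uint32_t)interlocked_exchange_add(target, 1) + 1u);
}

/* Atomically decrement *target and return the new value. */
static int32_t interlocked_decrement(_Atomic int32_t *target)
{
    return (int32_t)((uint32_t)interlocked_exchange_add(target, -1) - 1u);
}
```

Because no shared lock variable is involved, two processes mapping the same memory could in principle use such routines on it, which is the first drawback of the spinlock version listed in the message.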
florian
5fa184c952 + patch by Jeppe Johansen to make use of the div/udiv instruction on arm7m, resolves #20022
* explicitly make symbol addressing PC relative

git-svn-id: trunk@19221 -
2011-09-24 21:41:01 +00:00
sergei
4ebc34c5e7 * Promoted result type of FPC_PCHAR_LENGTH and FPC_PWIDECHAR_LENGTH to SizeInt.
+ Check for nil pointer in FPC_PWIDECHAR_LENGTH

git-svn-id: trunk@17733 -
2011-06-13 04:59:17 +00:00
florian
8bff2a0de4 * patch by Jeppe Johansen to fix thumb2 epilog generation, resolves #18392
git-svn-id: trunk@17252 -
2011-04-05 19:25:20 +00:00
florian
0e74cea8ed * patch by Simon Ley to improve move on arm: unneeded plds are removed, resolves #19050
git-svn-id: trunk@17251 -
2011-04-05 18:44:10 +00:00
Jonas Maebe
780e75bfac o patch by Jeppe Johansen to fix mantis #17472:
* generate add.w instead of add for thumb-2 in case one of the registers
      is > r8
    * add register interferences for the "add" instruction so the register
      allocator can detect invalid instruction forms (even for assembler code)
    * fixed error in thumb2.inc detected by the previous change

git-svn-id: trunk@16633 -
2010-12-24 15:54:39 +00:00
Jonas Maebe
c14574bb56 * don't change the fpu control word in the initialisation code of dynamic
libraries (mantis #16263, #16801)

git-svn-id: trunk@16347 -
2010-11-14 16:00:25 +00:00
florian
24fea58b92 + initial implementation of iso style gotos in iso mode
* made setjmp/longjmp accessible to the compiler by compiler proc, they are used by the iso goto code

git-svn-id: trunk@15711 -
2010-08-05 19:20:46 +00:00
florian
3aa1315c06 * thumb2 opcode fixes by Jeppe Johansen, resolves #16306
git-svn-id: trunk@15154 -
2010-04-21 17:40:35 +00:00
Jonas Maebe
fbebd87593 * use BLX instead of "mov r14, r15; mov r15, reg" for a_call_reg on ARMv6
and above, so this also works when calling thumb code (should actually
    also be done for ARMv5T, but we don't have a monicker for that yet)
  * use BX instead of "mov r15, r14" for simple returns from subroutines
    on ARMv6+ to support returning to thumb code from ARM code (idem)

git-svn-id: trunk@14332 -
2009-12-04 22:38:50 +00:00
Jonas Maebe
91fc26a530 * the bits in the VFP fpscr don't mask exceptions, but enable them
(was used correctly in fpu init code in arm.inc, but inverted in
     setexceptionmask logic)

git-svn-id: trunk@14328 -
2009-12-04 19:54:35 +00:00
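The distinction in the commit above (FPSCR bits enable exception traps rather than mask them) can be shown with a small C sketch. The bit positions are the VFP FPSCR trap-enable bits; the function name `fpscr_apply_exception_mask` and the choice to express the mask in the same bit positions are assumptions for illustration, not the RTL's actual interface.

```c
#include <stdint.h>

/* VFP FPSCR exception trap-enable bits: a set bit ENABLES the trap. */
#define FPSCR_IOE (1u << 8)   /* invalid operation */
#define FPSCR_DZE (1u << 9)   /* division by zero */
#define FPSCR_OFE (1u << 10)  /* overflow */
#define FPSCR_UFE (1u << 11)  /* underflow */
#define FPSCR_IXE (1u << 12)  /* inexact */
#define FPSCR_IDE (1u << 15)  /* input denormal */
#define FPSCR_ENABLE_ALL \
    (FPSCR_IOE | FPSCR_DZE | FPSCR_OFE | FPSCR_UFE | FPSCR_IXE | FPSCR_IDE)

/* masked: bits set for exceptions that should be suppressed (assumed to
   use the same bit positions). The enable field is the complement of
   the mask; all other FPSCR bits are preserved. */
static uint32_t fpscr_apply_exception_mask(uint32_t fpscr, uint32_t masked)
{
    uint32_t enabled = ~masked & FPSCR_ENABLE_ALL;
    return (fpscr & ~FPSCR_ENABLE_ALL) | enabled;
}
```

Writing the mask bits directly into the register, without the complement, gives exactly the inverted behaviour the commit describes.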
Jonas Maebe
d1538ab023 o added ARM VFPv2/VFPv3 support:
+ RTL support:
      o VFP exceptions are disabled by default on Darwin,
        because they cause kernel panics on iPhoneOS 2.2.1 at least
      o all denormals are truncated to 0 on Darwin, because disabling
        that also causes kernel panics on iPhoneOS 2.2.1 (probably
        because otherwise denormals can also cause exceptions)
    * set softfloat rounding mode correctly for non-wince/darwin/vfp
      targets
    + compiler support: only half the number of single precision
      registers is available due to limitations of the register
      allocator
    + added a number of comments about why the stackframe on ARM is
      set up the way it is by the compiler
    + added regtype and subregtype info to regsets, because they're
      also used for VFP registers (+ support in assembler reader)
    + various generic support routines for dealing with floating point
      values located in integer registers that have to be transferred to
      mm registers (needed for VFP)
    * renamed use_sse() to use_vectorfpu() and also use it for
      ARM/vfp support
    o only superficially tested for Linux (compiler compiled with -Cpvfpv6
      -Cfvfpv2 works on a Cortex-A8, no testsuite run performed -- at least
      the fpu exception handler still needs to be implemented), Darwin has
      been tested more thoroughly
  + added ARMv6 cpu type and made it default for Darwin/ARM
  + ARMv6+ implementations of atomic operations using ldrex/strex
  * don't use r9 on Darwin/ARM, as it's reserved under certain
    circumstances (don't know yet which ones)
  * changed C-test object files for ARM/Darwin to ARMv6 versions
  * check in assembler reader that regsets are not empty, because
    instructions with a regset operand have undefined behaviour in that
    case
  * fixed resultdef of tarmtypeconvnode.first_int_to_real in case of
    int64->single type conversion
  * fixed constant pool locations in case 64 bit constants are generated,
    and/or when vfp instructions with limited reach are present

  WARNING: when using VFP on an ARMv6 or later cpu, you *must* compile all
    code with -Cparmv6 (or higher), or you will get crashes. The reason is
    that storing/restoring multiple VFP registers must happen using
    different instructions on pre/post-ARMv6.

git-svn-id: trunk@14317 -
2009-12-03 22:46:30 +00:00
florian
515774b864 * merged armthum branch
-- Merging differences between repository URLs into '.':
U    rtl/arm/setjump.inc
A    rtl/arm/thumb2.inc
U    rtl/arm/divide.inc
A    rtl/embedded/arm/stm32f103.pp
U    rtl/inc/system.inc
U    compiler/alpha/cgcpu.pas
U    compiler/sparc/cgcpu.pas
U    compiler/i386/cgcpu.pas
U    compiler/ncgld.pas
U    compiler/powerpc/cgcpu.pas
U    compiler/avr/cgcpu.pas
U    compiler/aggas.pas
U    compiler/powerpc64/cgcpu.pas
U    compiler/x86_64/cgcpu.pas
U    compiler/cgobj.pas
U    compiler/psystem.pas
U    compiler/aasmtai.pas
U    compiler/m68k/cgcpu.pas
U    compiler/ncgutil.pas
U    compiler/rautils.pas
U    compiler/arm/raarmgas.pas
U    compiler/arm/armatts.inc
U    compiler/arm/cgcpu.pas
U    compiler/arm/armins.dat
U    compiler/arm/rgcpu.pas
U    compiler/arm/cpubase.pas
U    compiler/arm/agarmgas.pas
U    compiler/arm/cpuinfo.pas
U    compiler/arm/armop.inc
U    compiler/arm/narmadd.pas
U    compiler/arm/aoptcpu.pas
U    compiler/arm/armatt.inc
U    compiler/arm/aasmcpu.pas
U    compiler/systems/t_embed.pas
U    compiler/psub.pas
U    compiler/options.pas

git-svn-id: trunk@13801 -
2009-10-04 09:03:44 +00:00
Jonas Maebe
22aacd2a60 * return 0 for length(pchar(0)), like Kylix does (using corrected and
multi-platform version of patch in r12461, which caused the i386 version
    of fpc_pchar_length to return 0 in all cases, which used tabs, and did
    not include a test case)

git-svn-id: trunk@12464 -
2009-01-01 22:02:17 +00:00
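The behaviour described in the commit above, a length of 0 for a nil pointer, amounts to one extra check in front of the usual scan for the terminator. A minimal C sketch, with `pchar_length` as a hypothetical name (this is not the per-platform assembler from the RTL):

```c
#include <stddef.h>

/* Length of a zero-terminated string; a NULL pointer counts as length 0
   instead of being dereferenced. */
static size_t pchar_length(const char *p)
{
    if (p == NULL)
        return 0;

    const char *end = p;
    while (*end != '\0')
        ++end;
    return (size_t)(end - p);
}
```

The commit message notes that the earlier attempt returned 0 in all cases on i386 and lacked a test, so it is worth checking both the nil case and a non-empty string.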
florian
6dcdf5bdf4 * tabs/spaces fixed
git-svn-id: trunk@12015 -
2008-11-02 09:41:30 +00:00
Jonas Maebe
30a51c2dee + support for the different rounding modes in the generic rounding
routines (mantis #11392)

git-svn-id: trunk@11290 -
2008-06-27 17:20:56 +00:00
yury
20a12503b8 * Fixed fpc_shortstr_to_shortstr for arm.
git-svn-id: trunk@10651 -
2008-04-13 16:17:14 +00:00
yury
3dc94e678d * Fixed fpc_shortstr_assign for arm.
git-svn-id: trunk@10635 -
2008-04-12 15:58:35 +00:00
yury
5dc6e54925 * Removed inline for procedures with assembler or formal parameters, since inline is not supported for them (compiler warns about that now). Even if there is no inline modifier in interface declaration of procedure, it is possible to specify inline in procedure implementation if needed (e.g. for generic implementations) and inlining will work for them.
git-svn-id: trunk@10629 -
2008-04-12 11:37:49 +00:00
micha
4a7f6bccf9 * fix arm edsp test to load from aligned address
git-svn-id: trunk@10487 -
2008-03-13 21:36:01 +00:00
micha
89e9d4ab17 * fix int64 multiplication on armeb
git-svn-id: trunk@10461 -
2008-03-08 13:02:51 +00:00
florian
c544d97de9 * fix edsp instructions detection
git-svn-id: trunk@10458 -
2008-03-07 21:51:14 +00:00
daniel
d8bffd27fc - Integrate i386/strlen.inc and remove it.
+ int_str assembler implementations for i386
 + fpc_shortstr_to_shortstr assembler implementation for ARM
 + fpc_shortstr_assign assembler implementation for ARM
 + fpc_Pchar_length assembler implementation for ARM

git-svn-id: trunk@9582 -
2007-12-30 11:19:10 +00:00
daniel
68731ae067 + Assembler implementation of mod/div.
Improves amount of divides from about 230000/s to about 2400000/s on
    ARM920T, 200MHz.

git-svn-id: trunk@9543 -
2007-12-27 17:59:45 +00:00
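Cores such as the ARM920T mentioned above have no hardware divide instruction, so div and mod have to be computed in software. The commit does not say which algorithm its assembler uses; the C sketch below shows classic shift-and-subtract (restoring) division purely as an illustration of the general idea. The name `udivmod32` is hypothetical, and the divisor is assumed to be nonzero.

```c
#include <stdint.h>

/* Unsigned 32-bit division by shift-and-subtract.
   Returns the quotient and stores the remainder in *rem.
   Precondition (assumed, not checked): d != 0. */
static uint32_t udivmod32(uint32_t n, uint32_t d, uint32_t *rem)
{
    uint32_t q = 0;
    uint32_t r = 0;

    for (int i = 31; i >= 0; --i) {
        /* Bring down the next bit of the dividend. */
        r = (r << 1) | ((n >> i) & 1u);
        /* If the divisor fits, subtract it and set the quotient bit. */
        if (r >= d) {
            r -= d;
            q |= 1u << i;
        }
    }
    *rem = r;
    return q;
}
```

A hand-written assembler version would typically unroll this loop or skip leading zero bits, which is where large speedups over a generic routine can come from.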
yury
1ea7d58a61 * Fixed arm-linux build.
git-svn-id: trunk@9055 -
2007-11-02 09:32:05 +00:00
yury
e62c6cfcc4 * Fixed warnings and notes.
git-svn-id: trunk@9041 -
2007-11-01 14:16:43 +00:00
yury
c85f6fb53b * Fixed access to stack parameter in fpc_mul_qword for arm.
git-svn-id: trunk@9030 -
2007-11-01 10:34:27 +00:00
yury
986396545d * Fixed register saving in fpc_mul_qword for arm. It fixed bug #10017.
* Removed unneeded register lists for some pure asm routines for arm.

git-svn-id: trunk@9019 -
2007-10-31 23:11:50 +00:00
florian
7da7364ee7 * refactored SysResetFPU into SysInitFPU and SysResetFPU
git-svn-id: trunk@8966 -
2007-10-28 12:06:49 +00:00
florian
76b95fb058 * fixed arm-linux compilation with FPC_USE_LIBC
git-svn-id: trunk@8809 -
2007-10-14 21:05:44 +00:00
yury
ef3178cdb1 * Fixed default float exceptions mask for arm fpu. It fixes tw3160c.pp on arm-linux.
git-svn-id: trunk@8054 -
2007-07-14 19:46:22 +00:00
Jonas Maebe
7d44ca0113 * fixed unportable soft float mask handling which broke on big endian
systems after yesterday's set changes

git-svn-id: trunk@7402 -
2007-05-20 10:25:48 +00:00
florian
d78071f8b2 * ensure that softfloat and libgcc float never use rfs/wfs
git-svn-id: trunk@7229 -
2007-05-01 11:47:19 +00:00
florian
2085635fe7 * load moveproc with default value
git-svn-id: trunk@6803 -
2007-03-12 19:31:52 +00:00
florian
fd6fdfe896 * check used fpu type properly
git-svn-id: trunk@6786 -
2007-03-11 17:17:06 +00:00
florian
b5b86f6d73 * ce compilation fixed
git-svn-id: trunk@6457 -
2007-02-12 18:39:39 +00:00
florian
31c9a91af0 + edsp detection for arm-linux
git-svn-id: trunk@6429 -
2007-02-11 16:21:04 +00:00
florian
57415a73a7 + assembler coded move for arm
git-svn-id: trunk@6412 -
2007-02-11 11:04:55 +00:00
florian
1ab81c7eb6 * fixed fpa flag setting
git-svn-id: trunk@6154 -
2007-01-23 22:11:54 +00:00
florian
83a0391c24 * gba and nds have no softfloat support
git-svn-id: trunk@6090 -
2007-01-20 20:41:04 +00:00
yury
458abdef3e * implemented SysResetFPU for arm-wince.
* set softfloat_exception_mask in SetExceptionMask for ARM.

git-svn-id: trunk@6035 -
2007-01-17 23:58:19 +00:00
florian
075011a2a5 * fpa exception masking fixed
git-svn-id: trunk@6026 -
2007-01-17 15:52:36 +00:00
yury
d401c0a198 * activated internal get_frame for ARM.
git-svn-id: trunk@5945 -
2007-01-13 15:23:51 +00:00
yury
68a71f4ca3 * fixed SetPrecisionMode/GetPrecisionMode for wince.
git-svn-id: trunk@5673 -
2006-12-22 00:49:17 +00:00
Legolas
9e6d19a494 * rtl part of first Nintendo DS port
git-svn-id: trunk@5593 -
2006-12-14 17:34:51 +00:00
florian
69ae03d6bc * fixed wrong operands of swp
git-svn-id: trunk@5072 -
2006-10-29 20:51:31 +00:00
florian
90e481ef13 * fixed arm-linux compilation
git-svn-id: trunk@4645 -
2006-09-18 19:47:52 +00:00
yury
ba21edb0fd * Implemented inclocked and declocked for arm.
git-svn-id: trunk@4534 -
2006-09-02 09:38:18 +00:00
yury
e1b9814b5d * fixed some warnings and notes while compiling RTL.
git-svn-id: trunk@4256 -
2006-07-19 10:31:15 +00:00
yury
11576fd24b * fixed warnings and notes while compiling system unit for wince.
git-svn-id: trunk@4250 -
2006-07-18 15:00:09 +00:00
yury
a083f5754e * implemented exceptions, rounding, precision control for arm-wince math.
git-svn-id: trunk@4104 -
2006-07-06 18:56:36 +00:00
yury
d7cbde6f25 * Assembler Interlocked* functions for ARM.
git-svn-id: trunk@4011 -
2006-06-30 15:36:49 +00:00
oro06
3afad32966 *arm: TPECoffLinker is TInternalLinkerWin
+arm : InterlockedCompareExchangePointer

git-svn-id: trunk@3993 -
2006-06-29 07:39:54 +00:00
peter
4c065bce45 * move InterLocked functions to system unit
git-svn-id: trunk@3933 -
2006-06-25 09:26:23 +00:00
florian
5575a837db * gba patch from Francesco Lombardi
git-svn-id: trunk@3716 -
2006-05-28 14:48:24 +00:00
yury
4b8ac056da * ifdef for WinCE was added.
git-svn-id: trunk@1215 -
2005-09-28 06:44:56 +00:00
florian
8adc1c9b0c + RTL part of WinCE patches from Yuri Sidorov
git-svn-id: trunk@572 -
2005-07-03 15:52:27 +00:00
michael
3a2eaa94b1 + Removed INTERNCONSTINTF define
git-svn-id: trunk@267 -
2005-06-07 22:04:18 +00:00
michael
93ba0409be + Removed HASCOMPILERPROC define
git-svn-id: trunk@265 -
2005-06-07 21:41:02 +00:00
peter
4ace790492 * remove $Log
git-svn-id: trunk@231 -
2005-06-07 09:47:55 +00:00
fpc
790a4fe2d3 * log and id tags removed
git-svn-id: trunk@42 -
2005-05-21 09:42:41 +00:00
fpc
50778076c3 initial import
git-svn-id: trunk@1 -
2005-05-16 18:37:41 +00:00
florian
069a5206e1 * move draft 2005-03-13 10:04:52 +00:00
peter
e417e34496 * truncate log 2005-02-14 17:13:06 +00:00
florian
264270bd96 * arctan, sin and cos are done in software on the arm 2005-01-06 13:02:03 +00:00
florian
0bc92dfa09 + added nostackframe directive to get_frame 2005-01-05 15:59:02 +00:00
florian
6333a6a6b3 * fillchar fixed; it's used now 2005-01-05 15:21:14 +00:00
florian
28a1c72885 + correct setting of FPU exception mask 2005-01-04 16:46:38 +00:00
florian
1033fb1430 + added nostackframe directive 2005-01-04 16:22:05 +00:00
florian
5974694623 * fixed overflow checking for qword*qword 2005-01-04 12:57:52 +00:00
florian
47521fde82 * fixed building 2005-01-01 18:34:24 +00:00
florian
ddb6d0d595 + assembler implementation of fpc_mul_qword
* fpu exceptions are now generated
2004-03-23 21:03:10 +00:00
florian
f5c99d9e2d * setjmp fixed 2004-03-23 19:13:09 +00:00
florian
5074b9a1a8 * disabled internal ln 2004-03-16 22:02:26 +00:00
florian
bca9da0ec7 * draft for qword mul 2004-03-14 21:45:11 +00:00
marco
e546db7a23 * interlocked* changed to longints, including winapi. (which was a bug) 2004-03-05 12:17:50 +00:00
florian
3e274eaa0f * some math nodes are inlined now 2004-01-27 15:04:49 +00:00
florian
7b5dc40284 * compilation on arm fixed 2004-01-26 11:48:24 +00:00
florian
a6589cbab1 + get_caller_addr/frame implemented 2004-01-21 23:12:07 +00:00
florian
1883a09ddd * fixed setjump
* fixed syscalls
2004-01-20 21:01:57 +00:00
peter
d11cecb354 * removed assembler
* cleanup
2003-12-24 22:27:13 +00:00
florian
cd88850377 * fixed some arm stuff 2003-11-30 19:48:20 +00:00
florian
b9376da0aa * some arm issues fixed 2003-11-21 00:40:06 +00:00
florian
eb8f265588 * initial revision 2003-11-03 17:28:21 +00:00
florian
8d771df2d4 * arm fixes to the common rtl code
* some generic math code fixed
  * ...
2003-09-03 14:09:37 +00:00
florian
736ae20a79 * empty dummy files
+ [long|set]jmp implemented
2003-08-21 16:41:54 +00:00
florian
454fa4f40c + basic makefile.cpu added 2003-08-21 03:24:43 +00:00