+ support for nested procedural variables:
o activate using {$modeswitch nestedprocvars} (compatible with all
regular syntax modes, enabled by default for MacPas mode)
o activating this mode switch changes the way the frame pointer is
passed to nested routines into the same way that Delphi uses (always
passed via the stack, and if necessary removed from the stack by
the caller) -- Todo: possibly also allow using this parameter
passing convention without enabling nested procvars, maybe even
by default in Delphi mode, see mantis #9432
o both global and nested routines can be passed to/assigned to a
nested procvar (and called via them). Note that converting global
*procvars* to nested procvars is intentionally not supported, so
that this functionality can also be implemented via compile-time
generated trampolines if necessary (e.g. for LLVM or CIL backends
as long as they don't support the aforementioned parameter passing
convention)
o a nested procvar can be declared either using a Mac/ISO Pascal style
"inline" type declaration as a parameter type, or as a stand-alone
type (in the latter case, add "is nested" at the end in analogy to
"of object" for method pointers -- note that using variables of
such a type is dangerous, because if you call them once the enclosing
stack frame no longer exists on the stack, the results are
undefined; this is however allowed for Metaware Pascal compatibility)
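A minimal sketch of the stand-alone "is nested" declaration described above,
assuming objfpc mode. The names TVisitor, ForEach, Sum and AddValue are
hypothetical and only serve as illustration; they are not taken from the
compiler or its test suite.

```pascal
program nestedprocdemo;
{$mode objfpc}
{$modeswitch nestedprocvars}

type
  { stand-alone nested procvar type; note the trailing "is nested" }
  TVisitor = procedure(value: longint) is nested;

{ calls visit once for every element of values }
procedure ForEach(const values: array of longint; visit: TVisitor);
var
  i: longint;
begin
  for i := low(values) to high(values) do
    visit(values[i]);
end;

function Sum(const values: array of longint): longint;
var
  total: longint;

  { nested routine: accesses "total" through the parent's stack frame }
  procedure AddValue(value: longint);
  begin
    total := total + value;
  end;

begin
  total := 0;
  { safe: Sum's stack frame is still alive while ForEach is running }
  ForEach(values, @AddValue);
  Result := total;
end;

begin
  writeln(Sum([1, 2, 3, 4]));
end.
```

The call is safe here because AddValue is only invoked while Sum is still
active. Storing @AddValue in a global variable of type TVisitor and calling
it after Sum has returned would be the undefined case mentioned above.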
git-svn-id: trunk@15694 -
represent complex locations (required for full x86-64 ABI support,
which is not yet implemented) -> lots of special result handling
code has been removed and replaced by the parameter handling
routines
+ added support for composite parameters (and hence function
results) to tcg.a_load_ref_cgpara() (so it can be used for
handling, e.g., 64 bit parameters on 32 bit platforms)
* the above fixed writing past the end of allocated memory when
handling records returned in registers on x86-64 whose size is
not a multiple of 8 bytes (mantis #16357)
- removed the x86-64 and PPC specific versions of a_load_ref_cgpara(),
as they are now handled correctly by the generic version
* moved the responsibility of allocating tcgpara cpu registers
(using paramanager.allocparaloc()) from the callers of
cg.a_load*_cgpara() to the cg.a_load*_cgpara() methods
themselves (so the register allocation can be done efficiently
when dealing with function results)
* for the above, renamed paramanager.alloc/freeparaloc() to
paramanager.alloc/freecgpara(), and use paramanager.allocparaloc()
to allocate individual pcgparalocations instead
* fixed the register size of SSE2 function result registers for
x86-64 (when used for floating point), which results in removing
a few superfluous "movs? %xmm0,%xmm0" instructions
* fixed compilation of paramanagers of avr, m68k and mips after r13695
and also updated them for these new changes
git-svn-id: trunk@15350 -
tobjectdef.needs_inittable returns false for classes nowadays (and has
done so for quite some time)
* nevertheless replaced all usages in the compiler of x.needs_inittable with
is_managed_type(x) (in case some other condition is added again in the
future) and removed all remaining accompanying "and not is_class(x)"
checks
git-svn-id: trunk@15320 -
ncgutil it can be used elsewhere too
- removed the code that checks for 64 bit integer types in the float
para loading code, since is_64bit() can never return true for a
floatdef
git-svn-id: trunk@15318 -
+ RTL support:
o VFP exceptions are disabled by default on Darwin,
because they cause kernel panics on iPhoneOS 2.2.1 at least
o all denormals are truncated to 0 on Darwin, because disabling
that also causes kernel panics on iPhoneOS 2.2.1 (probably
because otherwise denormals can also cause exceptions)
* set softfloat rounding mode correctly for non-wince/darwin/vfp
targets
+ compiler support: only half the number of single precision
registers is available due to limitations of the register
allocator
+ added a number of comments about why the stackframe on ARM is
set up the way it is by the compiler
+ added regtype and subregtype info to regsets, because they're
also used for VFP registers (+ support in assembler reader)
+ various generic support routines for dealing with floating point
values located in integer registers that have to be transferred to
mm registers (needed for VFP)
* renamed use_sse() to use_vectorfpu() and also use it for
ARM/vfp support
o only superficially tested for Linux (compiler compiled with -Cpvfpv6
-Cfvfpv2 works on a Cortex-A8, no testsuite run performed -- at least
the fpu exception handler still needs to be implemented), Darwin has
been tested more thoroughly
+ added ARMv6 cpu type and made it default for Darwin/ARM
+ ARMv6+ implementations of atomic operations using ldrex/strex
* don't use r9 on Darwin/ARM, as it's reserved under certain
circumstances (don't know yet which ones)
* changed C-test object files for ARM/Darwin to ARMv6 versions
* check in assembler reader that regsets are not empty, because
instructions with a regset operand have undefined behaviour in that
case
* fixed resultdef of tarmtypeconvnode.first_int_to_real in case of
int64->single type conversion
* fixed constant pool locations in case 64 bit constants are generated,
and/or when vfp instructions with limited reach are present
WARNING: when using VFP on an ARMv6 or later cpu, you *must* compile all
code with -Cparmv6 (or higher), or you will get crashes. The reason is
that storing/restoring multiple VFP registers must happen using
different instructions on pre/post-ARMv6.
git-svn-id: trunk@14317 -
to objc_msgSend* into the callnode. This allows reusing the current
call node rather than having to create a new one, and is in particular
necessary because even though the objc_msgSend* functions are declared
as varargs, you're supposed to typecast them to the function type
describing the method before calling them (so they should *not* use
varargs calling conventions!)
* for the above, a field called fobjcforcedprocname has been added to the
callnode, which can be set to a string that will be used as the (mangled)
name of the function to call instead of the mangled name of the procsym
-> fixes calling obj-c methods with floating point arguments on ppc
git-svn-id: branches/objc@13783 -
forced to something else by the compiler (internal rtl functions etc),
necessary for the objc branch
* fixed adding all used function result registers to the list of
registers that may need to be saved before a function call
git-svn-id: trunk@13695 -
physical registers
* free the physical return registers at the caller side for 64 bit
systems
* make sure that we do not double-free registers in case a return
value is not used (mantis #13536)
git-svn-id: trunk@13023 -
alignment for each memory reference (mantis #12137, and
test/packages/fcl-registry/tregistry1.pp on sparc). This also
enables better code generation for packed records in many cases.
o several changes were made to the compiler to minimise the chances
of accidentally forgetting to set the alignment of memory references
in the future:
- reference_reset*() now has an extra alignment parameter
- location_reset() can now only be used for locations other than
LOC_(C)REFERENCE; use location_reset_ref() for LOC_(C)REFERENCE
(split the tloc enum so the compiler can catch errors using range
checking)
git-svn-id: trunk@12719 -
svn+ssh://jonas@svn.freepascal.org/FPC/svn/fpc/branches/wpo
........
r11878 | jonas | 2008-10-11 02:25:18 +0200 (Sat, 11 Oct 2008) | 19 lines
+ initial implementation of whole-program optimisation framework
+ implementation of whole-program devirtualisation
o use:
a) generate whole-program optimisation information (no need
to completely compile the program and all of its units
with -OW/-FW, only the main program is sufficient)
fpc -OWdevirtcalls -FWmyprog.wpo myprog
b) use it to optimise the program
fpc -B -Owdevirtcalls -Fwmyprog.wpo myprog
(the -B is not required, but only sources recompiled during
the second pass will actually be optimised -- if you want,
you can even rebuild the rtl devirtualised for a particular
program; and these options can obviously also be used
together with regular optimisation switches)
o warning:
- there are no checks yet to ensure that you do not use
units optimised for a particular program with another
program (or with a changed version of the same program)
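A sketch of the kind of program the two-step use above is aimed at. The
classes TShape and TSquare are hypothetical; the file is assumed to be
saved as myprog.pp so that it matches the command lines above. Whether a
given call is actually devirtualised depends on what the compiler can
prove about the program.

```pascal
program myprog;
{$mode objfpc}

type
  TShape = class
    function Area: double; virtual;
  end;

  TSquare = class(TShape)
    side: double;
    function Area: double; override;
  end;

function TShape.Area: double;
begin
  Result := 0;
end;

function TSquare.Area: double;
begin
  Result := side * side;
end;

var
  s: TShape;
begin
  { TSquare is the only class that is ever instantiated, so the virtual
    call s.Area below can only end up in TSquare.Area }
  s := TSquare.Create;
  TSquare(s).side := 3;
  writeln(s.Area:0:1);
  s.Free;
end.
```

The first compilation (-OWdevirtcalls -FWmyprog.wpo) records which classes
are instantiated; the second one (-Owdevirtcalls -Fwmyprog.wpo) can use
that feedback to turn s.Area into a regular static call.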
........
r11881 | jonas | 2008-10-11 19:35:52 +0200 (Sat, 11 Oct 2008) | 13 lines
* extracted code to detect constructed class/object types from
tcallnode.gen_vmt_tree into its own method to avoid clutter
* detect x.classtype.create constructs (with classtype = the
system.tobject.classtype method), and treat them as if a
"class of x" has been instantiated rather than a
"class of tobject". this required storing the instantiated
classrefs in their own array though, because at such a
point we don't have a "class of x" tdef available (so
now "x", and all other defs instantiated via a classref,
are now stored as tobjectdefs in a separate array)
+ support for devirtualising class methods (including
constructors)
........
r11882 | jonas | 2008-10-11 20:44:02 +0200 (Sat, 11 Oct 2008) | 7 lines
+ -Owoptvmts whole program optimisation which replaces vmt entries
with method names of child classes in case the current class'
method can never be called (e.g., because this class is never
instantiated). As a result, such methods can then be removed
by dead code removal/smart linking (not much effect for either
the compiler, lazarus or a trivial lazarus app though).
........
r11889 | jonas | 2008-10-12 14:29:54 +0200 (Sun, 12 Oct 2008) | 2 lines
* some comment fixes
........
r11891 | jonas | 2008-10-12 18:49:13 +0200 (Sun, 12 Oct 2008) | 4 lines
* fixed twpofilereader.getnextnoncommentline() when reusing a previously
read line
* fixed skipping of unnecessary wpo feedback file sections
........
r11892 | jonas | 2008-10-12 23:42:43 +0200 (Sun, 12 Oct 2008) | 31 lines
+ symbol liveness wpo information extracted from smartlinked programs
(-OW/-Owsymbolliveness)
+ use symbol liveness information to improve devirtualisation (don't
consider classes created in code that has been dead code stripped).
This requires at least two passes of using wpo (first uses dead code
info to locate classes that are constructed only in dead code,
second pass uses this info to potentially further devirtualise).
I.e.:
1) generate initial liveness and devirtualisation feedback
fpc -FWtt.wpo -OWall tt.pp -Xs- -CX -XX
2) use previously generated feedback, and regenerate new feedback
based on this (i.e., disregard classes created in dead code)
fpc -FWtt-1.wpo -OWall -Fwtt.wpo -Owall tt.pp -Xs- -CX -XX
3) use the newly generated feedback (in theory, it is possible
that even more opportunities pop up afterwards; you can
continue until the program does not get smaller anymore)
fpc -Fwtt-1.wpo -Owall tt.pp -CX -XX
* changed all message() to cgmessage() calls so they set codegenerror
* changed static fsectionhandlers field to a regular field called
fwpocomponents
* changed registration of wpocomponents: no longer happens in the
initialization section of their unit, but in the InitWpo routine
(which has been moved from the wpoinfo to the wpo unit). This way
you can register different classes based on the target/parameters.
+ added static method to twpocomponentbase for checking whether
the command line parameters don't conflict with the requested
optimisations (e.g. generating liveness info requires that
smartlinking is turned on)
+ added static method to twpocomponentbase to request the
section name
........
r11893 | jonas | 2008-10-12 23:53:57 +0200 (Sun, 12 Oct 2008) | 3 lines
* fixed comment error (twpodeadcodeinfo keeps a list of live,
not dead symbols)
........
r11895 | jonas | 2008-10-13 00:13:59 +0200 (Mon, 13 Oct 2008) | 2 lines
+ documented -OW<x>, -Ow<x>, -FW<x> and -Fw<x> wpo parameters
........
r11899 | jonas | 2008-10-14 22:14:56 +0200 (Tue, 14 Oct 2008) | 2 lines
* replaced hardcoded string with objdumpsearchstr constant
........
r11900 | jonas | 2008-10-14 22:15:25 +0200 (Tue, 14 Oct 2008) | 2 lines
* reset wpofeedbackinput and wpofeedbackoutput in wpodone
........
r11901 | jonas | 2008-10-14 22:16:07 +0200 (Tue, 14 Oct 2008) | 2 lines
* various additional comments and comment fixes
........
r11902 | jonas | 2008-10-15 18:09:42 +0200 (Wed, 15 Oct 2008) | 5 lines
* store vmt procdefs in the ppu files so we don't have to use a hack to
regenerate them for whole-program optimisation
* fixed crash when performing devirtualisation optimisation on programs
that do not construct any classes/objects with optimisable vmts
........
r11935 | jonas | 2008-10-19 12:24:26 +0200 (Sun, 19 Oct 2008) | 4 lines
* set the vmt entries of non-class virtual methods of non-instantiated
objects/classes to FPC_ABSTRACTERROR so the code they refer to can
be thrown away if it is not referred to in any other way either
........
r11938 | jonas | 2008-10-19 20:55:02 +0200 (Sun, 19 Oct 2008) | 7 lines
* record all classrefdefs/objdefs for which a loadvmtaddrnode is generated,
and instead of marking all classes that derive from instantiated
classrefdefs as instantiated, only mark those classes from the above
collection that derive from instantiated classrefdefs as
instantiated (since to instantiate a class, you have to load its vmt
somehow -- this may be broken by using assembler code though)
........
r12212 | jonas | 2008-11-23 12:26:34 +0100 (Sun, 23 Nov 2008) | 3 lines
* fixed to work with the new vmtentries that are always available and
removed previously added code to save/load vmtentries to ppu files
........
r12304 | jonas | 2008-12-05 22:23:30 +0100 (Fri, 05 Dec 2008) | 4 lines
* check whether the correct wpo feedback file is used in the current
compilation when using units that were compiled using wpo information
during a previous compilation run
........
r12308 | jonas | 2008-12-06 18:03:39 +0100 (Sat, 06 Dec 2008) | 2 lines
* abort compilation if an error occurred during wpo initialisation
........
r12309 | jonas | 2008-12-06 18:04:28 +0100 (Sat, 06 Dec 2008) | 3 lines
* give an error message instead of crashing with an io exception if the
compiler is unable to create the wpo feedback file specified using -FW
........
r12310 | jonas | 2008-12-06 18:12:43 +0100 (Sat, 06 Dec 2008) | 3 lines
* don't let the used wpo feedback file influence the interface crc (there's
a separate check for such changes)
........
r12316 | jonas | 2008-12-08 19:08:25 +0100 (Mon, 08 Dec 2008) | 3 lines
* document the format of the sections of the wpo feedback file inside the
feedback file itself
........
r12330 | jonas | 2008-12-10 22:26:47 +0100 (Wed, 10 Dec 2008) | 2 lines
* use sysutils instead of dos to avoid command line length limits
........
r12331 | jonas | 2008-12-10 22:31:11 +0100 (Wed, 10 Dec 2008) | 3 lines
+ support for testing whole program optimisation tests (multiple
compilations using successively generated feedback files)
........
r12332 | jonas | 2008-12-10 22:31:40 +0100 (Wed, 10 Dec 2008) | 2 lines
+ whole program optimisation tests
........
r12334 | jonas | 2008-12-10 22:38:07 +0100 (Wed, 10 Dec 2008) | 2 lines
- removed unused local variable
........
r12339 | jonas | 2008-12-11 18:06:36 +0100 (Thu, 11 Dec 2008) | 2 lines
+ comments for newly added fields to tobjectdef for devirtualisation
........
r12340 | jonas | 2008-12-11 18:10:01 +0100 (Thu, 11 Dec 2008) | 2 lines
* increase ppu version (was no longer different from trunk due to merging)
........
git-svn-id: trunk@12341 -
shortstring temps don't get maximum alignment)
* changed some gettemptyped() calls into gettemp() calls (gettemptyped
means that this temp can only be used for temps of that type,
which is necessary for refcounted types but not for floats)
git-svn-id: trunk@12036 -
the syntax is exactly the same as for "external", except for
the keyword. It is currently only active for Darwin targets.
It should also work at least for Linux targets, but only with
the GNU assembler (which is why it is not activated there)
+ test for this functionality
git-svn-id: trunk@12009 -
a) cpu64bitaddr, which means that we are generating a compiler which
will generate code for targets with a 64 bit address space/abi
b) cpu64bitalu, which means that we are generating a compiler which
will generate code for a cpu with support for 64 bit integer
operations (possibly running in a 32 bit address space, depending
on the cpu64bitaddr define)
All cpus which had cpu64bit set now have both the above defines set,
and none of the 32 bit cpus have cpu64bitalu set (and none will
compile with it currently)
+ pint and puint types, similar to aint/aword (not pword because
that conflicts with pword=^word)
* several changes from aint/aword to pint/puint
* some changes of tcgsize2size[OS_INT] to sizeof(pint)
git-svn-id: trunk@10320 -
* rename methodpointerinit/done to callinitblock/callcleanupblock
* moved checks in callnode to separate functions
* funcretnode is now always a simple node instead of a block of
statements
* funcret and methodpointer are generated/optimized only in pass_1 so
a conversion from calln to loadn is much easier
* function result assignments are much more often optimized to use the
assignment destination location instead of using a temp
git-svn-id: trunk@8558 -
* fixed downsizing the precision of floating point values
* floating point constants are now treated using only the minimal
precision required (e.g. 2.0 is now a single, 1.1 extended etc)
(Delphi compatible)
git-svn-id: trunk@5927 -
* symtables based on TFPHashObjectList and TFPObjectList
* rename torddef.typ to torddef.ordtype
* rename tfloatdef.typ to tfloatdef.floattype
* rename tdef.deftype to tdef.typ
* remove obsolete browser code, browcol is kept so the ide
can still be compiled
git-svn-id: trunk@5192 -
+ use {$bitpacking on/+} to change the meaning of "packed"
into "bitpacked" for arrays. This is the default for MacPas.
You can also define individual arrays as "bitpacked", but
this is not encouraged since this keyword is not known by
other compilers and therefore makes your code unportable.
+ pack(unpackedarray,index,packedarray) to pack
length(packedarray) elements starting at
unpackedarray[index] into packedarray.
+ unpack(packedarray,unpackedarray,index) to unpack
packedarray into unpackedarray, with the first
element being stored at unpackedarray[index]
* todo:
* "open packed arrays" and rtti for packed arrays are not
yet supported
* gdb does not properly support bitpacked arrays
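A minimal sketch of the pack/unpack calls described above, assuming they
are usable in the default syntax mode. The type and variable names are
hypothetical. The index argument uses the index type of the unpacked array.

```pascal
program packdemo;
{$bitpacking on}  { "packed array" below now means "bitpacked array" }

type
  TFlags = array[0..15] of boolean;
  TPackedFlags = packed array[0..7] of boolean;

var
  flags, restored: TFlags;
  bits: TPackedFlags;
  i: integer;
begin
  for i := 0 to 15 do
  begin
    flags[i] := odd(i);
    restored[i] := false;
  end;

  { copy length(bits) = 8 elements, flags[4]..flags[11], into bits }
  pack(flags, 4, bits);

  { copy all of bits back, with the first element stored at restored[8] }
  unpack(bits, restored, 8);

  for i := 8 to 15 do
    write(ord(restored[i]));
  writeln;
end.
```

If pack and unpack behave as described above, restored[8..15] ends up
equal to flags[4..11], so the program prints 01010101.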
git-svn-id: trunk@4449 -