mirror of
https://gitlab.com/freepascal.org/fpc/source.git
synced 2025-08-14 08:09:18 +02:00
+ Added Carls patches.
parent
08bb4c9d4b
commit
61285973f9
728
docs/prog.tex
@ -31,8 +31,8 @@
|
|||||||
\begin{document}
|
\begin{document}
|
||||||
\title{Free Pascal \\ Programmers' manual}
|
\title{Free Pascal \\ Programmers' manual}
|
||||||
\docdescription{Programmers' manual for \fpc, version \fpcversion}
|
\docdescription{Programmers' manual for \fpc, version \fpcversion}
|
||||||
\docversion{1.3}
|
\docversion{1.4}
|
||||||
\date{March 1998}
|
\date{July 1998}
|
||||||
\author{Micha\"el Van Canneyt}
|
\author{Micha\"el Van Canneyt}
|
||||||
\maketitle
|
\maketitle
|
||||||
\tableofcontents
|
\tableofcontents
|
||||||
@ -45,7 +45,7 @@
|
|||||||
This is the programmer's manual for \fpc.
|
This is the programmer's manual for \fpc.
|
||||||
|
|
||||||
It describes some of the peculiarities of the \fpc compiler, and provides a
|
It describes some of the peculiarities of the \fpc compiler, and provides a
|
||||||
glimp of how the compiler generates its code, and how you can change the
|
glimpse of how the compiler generates its code, and how you can change the
|
||||||
generated code. It will not, however, provide you with a detailed account of
|
generated code. It will not, however, provide you with a detailed account of
|
||||||
the inner workings of the compiler, nor will it tell you how to use the
|
the inner workings of the compiler, nor will it tell you how to use the
|
||||||
compiler (described in the \userref). It also will not describe the inner
|
compiler (described in the \userref). It also will not describe the inner
|
||||||
@ -193,17 +193,17 @@ appears on your system.
|
|||||||
|
|
||||||
{\em Remark :} Take care that the object file you're linking is in a
|
{\em Remark :} Take care that the object file you're linking is in a
|
||||||
format the linker understands. Which format this is, depends on the platform
|
format the linker understands. Which format this is, depends on the platform
|
||||||
you're on. Typing \var{ld} on th command line gives a list of formats
|
you're on. Typing \var{ld} on the command line gives a list of formats
|
||||||
\var{ld} knows about.
|
\var{ld} knows about.
|
||||||
|
|
||||||
You can pass other files and options to the linker using the \var{-k}
|
You can pass other files and options to the linker using the \var{-k}
|
||||||
command-line option. You can specify more than one of these options, and
|
command-line option. You can specify more than one of these options, and
|
||||||
they
|
they will be passed to the linker, in the order that you specified them on
|
||||||
will be passed to the linker, in the order that you specified them on the
|
the command line, just before the names of the object files that must be
|
||||||
command line, just before the names of the object files that must be linked.
|
linked.
|
||||||
|
|
||||||
% Assembler type
|
% Assembler type
|
||||||
\subsection{\var{\$I386\_XXX} : Specify assembler format}
|
\subsection{\var{\$I386\_XXX} : Specify assembler format (Intel x86 only)}
|
||||||
This switch informs the compiler what kind of assembler it can expect in an
|
This switch informs the compiler what kind of assembler it can expect in an
|
||||||
\var{asm} block. The \var{XXX} should be replaced by one of the following:
|
\var{asm} block. The \var{XXX} should be replaced by one of the following:
|
||||||
\begin{description}
|
\begin{description}
|
||||||
@ -218,7 +218,7 @@ is compiled, unless they are replaced by another directive of the same type.
|
|||||||
The command-line switch that corresponds to this switch is \var{-R}.
|
The command-line switch that corresponds to this switch is \var{-R}.
|
||||||
|
|
||||||
|
|
||||||
\subsection{\var{\$MMX} : MMX support}
|
\subsection{\var{\$MMX} : MMX support (Intel x86 only)}
|
||||||
As of version 0.9.8, \fpc supports optimization for the \textbf{MMX} Intel
|
As of version 0.9.8, \fpc supports optimization for the \textbf{MMX} Intel
|
||||||
processor (see also \ref{ch:MMXSupport}). This optimizes certain code parts for the \textbf{MMX} Intel
|
processor (see also \ref{ch:MMXSupport}). This optimizes certain code parts for the \textbf{MMX} Intel
|
||||||
processor, thus greatly improving speed. The speed is noticed mostly when
|
processor, thus greatly improving speed. The speed is noticed mostly when
|
||||||
@ -267,7 +267,7 @@ generated. You can specify this switch \textbf{only} befor the \var{Program}
|
|||||||
or \var{Unit} clause in your source file. The different kinds of formats are
|
or \var{Unit} clause in your source file. The different kinds of formats are
|
||||||
shown in \seet{Formats}.
|
shown in \seet{Formats}.
|
||||||
|
|
||||||
\begin{FPCltable}{ll}{Formats generated by the compiler}{Formats} \hline
|
\begin{FPCltable}{ll}{Formats generated by the x86 compiler}{Formats} \hline
|
||||||
Switch value & Generated format \\ \hline
|
Switch value & Generated format \\ \hline
|
||||||
att & AT\&T assembler file. \\
|
att & AT\&T assembler file. \\
|
||||||
o & Unix object file.\\
|
o & Unix object file.\\
|
||||||
@ -315,23 +315,42 @@ the executable. The effect of this switch is the same as the command-line
|
|||||||
switch \var{-g}. By default, insertion of debugging information is off.
|
switch \var{-g}. By default, insertion of debugging information is off.
|
||||||
|
|
||||||
\subsection{\var{\$E} : Emulation of coprocessor}
|
\subsection{\var{\$E} : Emulation of coprocessor}
|
||||||
This directive controls the emulation of the coprocessor. On the i386
|
|
||||||
processor, it is supported for
|
|
||||||
compatibility with Turbo Pascal. The compiler itself doesn't do the emulation
|
|
||||||
of the coprocessor. Under \dos, the \dos extender does this, and under
|
|
||||||
\linux, the kernel takes care of the coprocessor support.
|
|
||||||
|
|
||||||
If you use the Motorola 680x0 version, then the switch is recognized, as
|
This directive controls the emulation of the coprocessor. There is no
|
||||||
there is no extender to emulate the coprocessor, so the compiler must do
|
command-line counterpart for this directive.
|
||||||
that by itself.
|
|
||||||
|
\subsubsection{ Intel x86 version }
|
||||||
|
|
||||||
|
When this switch is enabled, all floating point instructions
|
||||||
|
which are not supported by standard coprocessor emulators will produce
|
||||||
|
a warning.
|
||||||
|
|
||||||
|
The compiler itself doesn't do the emulation of the coprocessor.
|
||||||
|
|
||||||
|
To use coprocessor emulation under \dos go32v1 there is nothing special
|
||||||
|
required, as it is handled automatically.
|
||||||
|
|
||||||
|
To use coprocessor emulation under \dos go32v2 you must use the
|
||||||
|
emu387 unit, which contains correct initialization code for the
|
||||||
|
emulator.
|
||||||
|
|
||||||
|
Under \linux, the kernel takes care of the coprocessor support.
|
||||||
|
|
||||||
|
\subsubsection{ Motorola 680x0 version }
|
||||||
|
|
||||||
|
When the switch is on, no floating point opcodes are emitted
|
||||||
|
by the code generator. Instead, internal run-time library routines
|
||||||
|
are called to do the necessary calculations. In this case all
|
||||||
|
real types are mapped to the single IEEE floating point type.
|
||||||
|
|
||||||
|
\emph{ Remark : } By default, emulation is on. It is possible to
|
||||||
|
intermix emulation code with real floating point opcodes, as
|
||||||
|
long as the only type used is single or real.
|
||||||
|
|
||||||
There is no command-line counterpart for this directive.
|
|
||||||
|
|
||||||
\subsection{\var{\$G} : Generate 80286 code}
|
\subsection{\var{\$G} : Generate 80286 code}
|
||||||
|
|
||||||
This option is recognised for Turbo Pascal cmpatibility, but is ignored,
|
This option is recognised for Turbo Pascal compatibility, but is ignored,
|
||||||
because the compiler needs at least a 386 or higher class processor.
|
|
||||||
|
|
||||||
|
|
||||||
\subsection{\var{\$L} : Local symbol information}
|
\subsection{\var{\$L} : Local symbol information}
|
||||||
|
|
||||||
@ -348,15 +367,20 @@ mathematics.
|
|||||||
\subsection{\var{\$O} : Overlay code generation }
|
\subsection{\var{\$O} : Overlay code generation }
|
||||||
|
|
||||||
This switch is recognised for Turbo Pascal compatibility, but is otherwise
|
This switch is recognised for Turbo Pascal compatibility, but is otherwise
|
||||||
ignored, since the compiler requires a 386 or higher computer, with at
|
ignored.
|
||||||
least 4 Mb. of ram.
|
|
||||||
|
|
||||||
\subsection{\var{\$Q} : Overflow checking}
|
\subsection{\var{\$Q} : Overflow checking}
|
||||||
The \var{\{\$Q+\}} directive turns on integer overflow checking.
|
The \var{\{\$Q+\}} directive turns on integer overflow checking.
|
||||||
This means that the compiler inserts code to check for overflow when doing
|
This means that the compiler inserts code to check for overflow when doing
|
||||||
computations with an integer.
|
computations with an integer.
|
||||||
When an overflow occurs, the run-time library will print a message
|
When an overflow occurs, the run-time library will print a message
|
||||||
\var{Overflow at xxx}, and exit the program with exit code 1.
|
\var{Overflow at xxx}, and exit the program with exit code 215.
|
||||||
|
|
||||||
|
\emph{ Remark: } Overflow checking behaviour is not the same as in
|
||||||
|
Turbo Pascal since all arithmetic operations are done via 32-bit
|
||||||
|
values. Furthermore, the Inc() and Dec() standard system procedures
|
||||||
|
\emph{ are } checked for overflow in \fpc, while in Turbo Pascal they
|
||||||
|
are not.
|
||||||
|
|
||||||
Using the \var{\{\$Q-\}} switch switches off the overflow checking code
|
Using the \var{\{\$Q-\}} switch switches off the overflow checking code
|
||||||
generation.
|
generation.
|
||||||
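As a rough illustration of the behaviour described above (a minimal sketch,
not taken from the \fpc sources; the program name \var{OverflowDemo} and the
variable \var{Counter} are made up for this example, and a 32-bit target is
assumed):
\begin{verbatim}
program OverflowDemo;
{$Q+}
var
  Counter : LongInt;
begin
  Counter := 2147483647;  { largest 32-bit signed value }
  Inc(Counter);           { checked, per the remark above }
  WriteLn(Counter);       { not reached if the check fires }
end.
\end{verbatim}
With overflow checking on, the program is expected to stop at the \var{Inc}
call with the overflow message and exit code 215. Compiled with
\var{\{\$Q-\}}, no check is generated and the value is expected to wrap
around instead.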
@ -370,27 +394,26 @@ indices, enumeration types, subrange types, etc. Specifying the
|
|||||||
\var{\{\$R+\}} switch tells the compiler to generate code to check these
|
\var{\{\$R+\}} switch tells the compiler to generate code to check these
|
||||||
indices. If, at run-time, an index or enumeration type is specified that is
|
indices. If, at run-time, an index or enumeration type is specified that is
|
||||||
out of the declared range of the compiler, then a run-time error is
|
out of the declared range of the compiler, then a run-time error is
|
||||||
generated, and the program exits with exit code 1.
|
generated, and the program exits with exit code 201.
|
||||||
|
|
||||||
The \var{\{\$R-\}} switch tells the compiler not to generate range checking
|
The \var{\{\$R-\}} switch tells the compiler not to generate range checking
|
||||||
code. This may result in faulty program behaviour, but no run-time errors
|
code. This may result in faulty program behaviour, but no run-time errors
|
||||||
will be generated.
|
will be generated.
|
||||||
|
|
||||||
{\em Remark: } this has not been implemented completely yet.
|
{\em Remark: } Range checking for sets and enumerations is not yet fully
|
||||||
|
implemented.
|
||||||
|
|
||||||
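The effect of range checking on array indices can be sketched as follows
(an illustrative example only; the names \var{RangeDemo}, \var{Values} and
\var{Index} are hypothetical, and the index is derived from \var{ParamCount}
purely so that its value is only known at run time):
\begin{verbatim}
program RangeDemo;
{$R+}
var
  Values : array[1..4] of LongInt;
  Index  : LongInt;
begin
  Index := 5 + ParamCount;  { only known at run time, always above 4 }
  Values[Index] := 42;      { index outside 1..4 }
end.
\end{verbatim}
With \var{\{\$R+\}} the assignment should trigger the run-time error and the
program should exit with code 201. With \var{\{\$R-\}} the write goes to
memory outside the array and no error is reported.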
\subsection{\var{\$S} : Stack checking}
|
\subsection{\var{\$S} : Stack checking}
|
||||||
The \var{\{\$S+\}} directive tells the compiler to generate stack checking
|
The \var{\{\$S+\}} directive tells the compiler to generate stack checking
|
||||||
code. This generates code to check if a stack overflow occurred, i.e. to see
|
code. This generates code to check if a stack overflow occurred, i.e. to see
|
||||||
whether the stack has grown beyond its maximally allowed size. If the stack
|
whether the stack has grown beyond its maximally allowed size. If the stack
|
||||||
grows beyond the maximum size, then a run-time error is generated, and the
|
grows beyond the maximum size, then a run-time error is generated, and the
|
||||||
program will exit with exit code 1.
|
program will exit with exit code 202.
|
||||||
|
|
||||||
Specifying \var{\{\$S-\}} will turn generation of stack-checking code off.
|
Specifying \var{\{\$S-\}} will turn generation of stack-checking code off.
|
||||||
|
|
||||||
There is no command-line switch which is equivalent to this directive.
|
The command-line compiler switch \var{-Ct} has the same effect as the
|
||||||
|
\var{\{\$S+\}} directive.
|
||||||
{\em Remark: } In principle, the stack is almost unlimited,
|
|
||||||
i.e. limited to the total free amount of memory on the computer.
|
|
||||||
|
|
||||||
|
|
||||||
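A small sketch of what stack checking guards against (the program below is
an assumption made for illustration, with the hypothetical names
\var{StackDemo} and \var{Depth}; how quickly the limit is reached depends on
the stack size of the platform):
\begin{verbatim}
program StackDemo;
{$S+}
function Depth(Level : LongInt) : LongInt;
begin
  { The addition after the call keeps every frame alive on the stack. }
  Depth := 1 + Depth(Level + 1);
end;

begin
  WriteLn(Depth(0));
end.
\end{verbatim}
The recursion never ends, so the stack keeps growing. With \var{\{\$S+\}}
(or the \var{-Ct} switch) the generated check is expected to stop the
program with exit code 202 once the maximum stack size is exceeded.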
\subsection{\var{\$X} : Extended syntax}
|
\subsection{\var{\$X} : Extended syntax}
|
||||||
@ -410,10 +433,10 @@ end;
|
|||||||
{$X-}
|
{$X-}
|
||||||
Func (A);
|
Func (A);
|
||||||
\end{verbatim}
|
\end{verbatim}
|
||||||
The reason this construct is supported is that
|
The reason this construct is supported is that you may wish to call a
|
||||||
you may wish to call a function for certain side-effects it has, but you
|
function for certain side-effects it has, but you don't need the function
|
||||||
don't need the function result. In this case you don't need to assign the
|
result. In this case you don't need to assign the function result, saving
|
||||||
function result, saving you an extra variable.
|
you an extra variable.
|
||||||
|
|
||||||
The command-line compiler switch \var{-Sa1} has the same effect as the
|
The command-line compiler switch \var{-Sa1} has the same effect as the
|
||||||
\var{\{\$X+\}} directive.
|
\var{\{\$X+\}} directive.
|
||||||
@ -500,7 +523,7 @@ you should change \var{v} with the version number of the compiler
|
|||||||
you're using, \var{r} with the release number and \var{p}
|
you're using, \var{r} with the release number and \var{p}
|
||||||
with the patch-number of the compiler. 'OS' needs to be changed by the type
|
with the patch-number of the compiler. 'OS' needs to be changed by the type
|
||||||
of operating system. Currently this can be one of \var{DOS}, \var{GO32V2},
|
of operating system. Currently this can be one of \var{DOS}, \var{GO32V2},
|
||||||
\var{LINUX}, \var{OS2} or \var{WIN32}. This symbol is undefined if you
|
\var{LINUX}, \var{OS2}, \var{WIN32}, \var{MACOS}, \var{AMIGA} or \var{ATARI}. This symbol is undefined if you
|
||||||
specify a target that is different from the platform you're compiling on.
|
specify a target that is different from the platform you're compiling on.
|
||||||
The \var{-TSomeOS} option on the command line will define the \var{SomeOS} symbol,
|
The \var{-TSomeOS} option on the command line will define the \var{SomeOS} symbol,
|
||||||
and will undefine the existing platform symbol\footnote{In versions prior to
|
and will undefine the existing platform symbol\footnote{In versions prior to
|
||||||
@ -809,7 +832,7 @@ need to compile with the \var{-Sm} command-line switch.
|
|||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
% Using assembly language
|
% Using assembly language
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
\chapter{Using assembly language}
|
\chapter{Using Assembly language}
|
||||||
\label{ch:AsmLang}
|
\label{ch:AsmLang}
|
||||||
\fpc supports inserting of assembler instructions in your code. The
|
\fpc supports inserting of assembler instructions in your code. The
|
||||||
mechanism for this is the same as under Turbo Pascal. There are, however
|
mechanism for this is the same as under Turbo Pascal. There are, however
|
||||||
@ -817,7 +840,7 @@ some substantial differences, as will be explained in the following.
|
|||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
% Intel syntax
|
% Intel syntax
|
||||||
\section{Intel syntax}
|
\section{Intel syntax (Intel x86 only) }
|
||||||
\label{se:Intel}
|
\label{se:Intel}
|
||||||
|
|
||||||
As of version 0.9.7, \fpc supports Intel syntax in its \var{asm} blocks.
|
As of version 0.9.7, \fpc supports Intel syntax in its \var{asm} blocks.
|
||||||
@ -956,7 +979,7 @@ The Intel inline assembler supports the following macros :
|
|||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
% AT&T syntax
|
% AT&T syntax
|
||||||
\section{AT\&T Syntax}
|
\section{AT\&T Syntax (Intel x86 only) }
|
||||||
\label{se:AttSyntax}
|
\label{se:AttSyntax}
|
||||||
\fpc uses the \gnu \var{as} assembler to generate its object files. Since
|
\fpc uses the \gnu \var{as} assembler to generate its object files. Since
|
||||||
the \gnu assembler uses AT\&T assembly syntax, the code you write should
|
the \gnu assembler uses AT\&T assembly syntax, the code you write should
|
||||||
@ -1057,57 +1080,33 @@ they are pushed {\em right} to {\em left}, instead of left to right for
|
|||||||
Turbo Pascal. This is especially important if you have some assembly
|
Turbo Pascal. This is especially important if you have some assembly
|
||||||
subroutines in Turbo Pascal which you would like to translate to \fpc.
|
subroutines in Turbo Pascal which you would like to translate to \fpc.
|
||||||
|
|
||||||
Function results are returned in the first register, if they fit in the
|
Function results are returned in the accumulator, if they fit in the
|
||||||
register. For more information on this, see \sees{Stack}
|
register.
|
||||||
|
|
||||||
The registers are {\em not} saved when calling a function or procedure. If
|
The registers are {\em not} saved when calling a function or procedure. If
|
||||||
you want to call a procedure or function from assembly language, you must
|
you want to call a procedure or function from assembly language, you must
|
||||||
save any registers you wish to preserve.
|
save any registers you wish to preserve.
|
||||||
|
|
||||||
The first thing a procedure does is saving the base pointer, and setting the
|
The first thing a procedure does is saving the base pointer, and setting the
|
||||||
base (\var{\%ebp}) pointer equal to the stack pointer (\var{\%esp}).
|
base pointer equal to the stack pointer. References to the pushed parameters
|
||||||
References to the pushed parameters and local variables are constructed
|
and local variables are constructed using the base pointer.
|
||||||
using the base pointer.
|
|
||||||
|
|
||||||
In practice this amounts to the following assembly code as the procedure or
|
When the procedure or function exits, it clears the stack.
|
||||||
function header :
|
|
||||||
\begin{verbatim}
|
|
||||||
pushl %ebp
|
|
||||||
movl %esp,%ebp
|
|
||||||
\end{verbatim}
|
|
||||||
|
|
||||||
When the procedure or function exits, it clears the stack by means of the
|
|
||||||
\var{RET xx} call, where \var{xx} is the total size of the pushed parameters
|
|
||||||
on the stack. Thus, in case parameters with a total size of \var{xx} have
|
|
||||||
been passed to a function, the generated exit sequence looks as follows:
|
|
||||||
\begin{verbatim}
|
|
||||||
leave
|
|
||||||
ret $xx
|
|
||||||
\end{verbatim}
|
|
||||||
|
|
||||||
When you want your code to be called by a C library or used in a C
|
When you want your code to be called by a C library or used in a C
|
||||||
program, you will run into trouble because of this calling mechanism. In C,
|
program, you will run into trouble because of this calling mechanism. In C,
|
||||||
the calling procedure is expected to clear the stack, not the called
|
the calling procedure is expected to clear the stack, not the called
|
||||||
procedure. To avoid this problem, \fpc supports the \var{export} modifier.
|
procedure. In other words, the arguments still are on the stack when the
|
||||||
Procedures that are defined using the export modifier, use a C-compatible
|
procedure exits. To avoid this problem, \fpc supports the \var{export}
|
||||||
calling mechanism. This means that they can be called from a C program or
|
modifier. Procedures that are defined using the export modifier, use a
|
||||||
library, or that you can use them as a callback function.
|
C-compatible calling mechanism. This means that they can be called from a
|
||||||
|
C program or library, or that you can use them as a callback function.
|
||||||
|
|
||||||
This also means that you cannot call this procedure or function from your
|
This also means that you cannot call this procedure or function from your
|
||||||
own program, since your program uses the Pascal calling convention.
|
own program, since your program uses the Pascal calling convention.
|
||||||
However, in the exported function, you can of course call other Pascal
|
However, in the exported function, you can of course call other Pascal
|
||||||
routines.
|
routines.
|
||||||
|
|
||||||
Technically, the C calling mechanism is implemented by generating the
|
|
||||||
following exit sequence at the end of your function or procedure:
|
|
||||||
\begin{verbatim}
|
|
||||||
leave {Copies EBP to ESP, pops EBP from the stack.}
|
|
||||||
ret
|
|
||||||
\end{verbatim}
|
|
||||||
Comparing this exit sequence with the previous one makes it clear why you
|
|
||||||
cannot call this procedure from within Pascal: The arguments still are on
|
|
||||||
the stack when the procedure exits.
|
|
||||||
|
|
||||||
As of version 0.9.8, the \fpc compiler supports also the \var{cdecl} and
|
As of version 0.9.8, the \fpc compiler supports also the \var{cdecl} and
|
||||||
\var{stdcall} modifiers, as found in Delphi. The \var{cdecl} modifier does
|
\var{stdcall} modifiers, as found in Delphi. The \var{cdecl} modifier does
|
||||||
the same as the \var{export} modifier, and \var{stdcall} does nothing, since
|
the same as the \var{export} modifier, and \var{stdcall} does nothing, since
|
||||||
@ -1136,6 +1135,54 @@ popstack & Right-to-left & Caller & No \\ \hline
|
|||||||
|
|
||||||
More about this can be found in \seec{Linking} on linking.
|
More about this can be found in \seec{Linking} on linking.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
\subsection{ Intel x86 calling conventions }
|
||||||
|
|
||||||
|
Standard entry code for procedures and functions is as follows on the
|
||||||
|
x86 architecture:
|
||||||
|
\begin{verbatim}
|
||||||
|
pushl %ebp
|
||||||
|
movl %esp,%ebp
|
||||||
|
\end{verbatim}
|
||||||
|
|
||||||
|
The generated exit sequence for procedures and functions looks as follows:
|
||||||
|
\begin{verbatim}
|
||||||
|
leave
|
||||||
|
ret $xx
|
||||||
|
\end{verbatim}
|
||||||
|
|
||||||
|
Where \var{xx} is the total size of the pushed parameters.
|
||||||
|
|
||||||
|
For more information on function return values, take a look at
|
||||||
|
\sees{RegConvs}.
|
||||||
|
|
||||||
|
|
||||||
|
\subsection{ Motorola 680x0 calling conventions }
|
||||||
|
|
||||||
|
Standard entry code for procedures and functions is as follows on the
|
||||||
|
680x0 architecture:
|
||||||
|
\begin{verbatim}
|
||||||
|
move.l a6,-(sp)
|
||||||
|
move.l sp,a6
|
||||||
|
\end{verbatim}
|
||||||
|
|
||||||
|
The generated exit sequence for procedures and functions looks as follows:
|
||||||
|
\begin{verbatim}
|
||||||
|
unlk a6
|
||||||
|
move.l (sp)+,a0 ; Get return address
|
||||||
|
add.l #xx,sp ; Remove allocated stack
|
||||||
|
move.l a0,-(sp) ; Put back return address on top of the stack
|
||||||
|
rts ; Return to the caller
|
||||||
|
\end{verbatim}
|
||||||
|
|
||||||
|
Where \var{xx} is the total size of the pushed parameters.
|
||||||
|
|
||||||
|
For more information on function return values, take a look at
|
||||||
|
\sees{RegConvs}.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
% Telling the compiler what registers have changed
|
% Telling the compiler what registers have changed
|
||||||
\section{Telling the compiler what registers have changed}
|
\section{Telling the compiler what registers have changed}
|
||||||
@ -1153,9 +1200,8 @@ asm
|
|||||||
...
|
...
|
||||||
end ['R1',...,'Rn'];
|
end ['R1',...,'Rn'];
|
||||||
\end{verbatim}
|
\end{verbatim}
|
||||||
Here \var{R1} to \var{Rn} are the names of the (extended) registers you
|
Here \var{R1} to \var{Rn} are the names of the 32-bit registers you
|
||||||
modify in your assembly code. They can be one of \var{'EAX', 'EBX', 'ECX',
|
modify in your assembly code.
|
||||||
'EDX', 'EDI', 'ESI'} for the Intel processor.
|
|
||||||
|
|
||||||
As an example :
|
As an example :
|
||||||
\begin{verbatim}
|
\begin{verbatim}
|
||||||
@ -1167,6 +1213,27 @@ As an example :
|
|||||||
\end{verbatim}
|
\end{verbatim}
|
||||||
This example tells the compiler that the \var{EAX} register was modified.
|
This example tells the compiler that the \var{EAX} register was modified.
|
||||||
|
|
||||||
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
|
% Register conventions
|
||||||
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
|
\section{Register Conventions}
|
||||||
|
\label{se:RegConvs}
|
||||||
|
|
||||||
|
The compiler has different register conventions, depending on the
|
||||||
|
target processor used.
|
||||||
|
|
||||||
|
\subsection{ Intel x86 version }
|
||||||
|
|
||||||
|
When optimizations are on, no register can be freely modified, without
|
||||||
|
first being saved and then restored. Otherwise, EDI is usually used as
|
||||||
|
a scratch register and can be freely used in assembler blocks.
|
||||||
|
|
||||||
|
\subsection{ Motorola 680x0 version }
|
||||||
|
|
||||||
|
Registers which can be freely modified without saving are registers
|
||||||
|
D0, D1, D6, A0, A1, and floating point registers FP2 to FP7. All other
|
||||||
|
registers are to be considered reserved and should be saved and then
|
||||||
|
restored when used in assembler blocks.
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
% Linking issues
|
% Linking issues
|
||||||
@ -1542,14 +1609,14 @@ instructions on how to use and declare objects, see \refref.
|
|||||||
When using objects that need virtual methods, the compiler uses two help
|
When using objects that need virtual methods, the compiler uses two help
|
||||||
procedures that are in the run-time library. They are called
|
procedures that are in the run-time library. They are called
|
||||||
\var{Help\_Destructor} and \var{Help\_Constructor}, and they are written in
|
\var{Help\_Destructor} and \var{Help\_Constructor}, and they are written in
|
||||||
assebly language. They are used to allocate the necessary memory if needed,
|
assembly language. They are used to allocate the necessary memory if needed,
|
||||||
and to insert the Virtual Method Table (VMT) pointer in the newly allocated
|
and to insert the Virtual Method Table (VMT) pointer in the newly allocated
|
||||||
object.
|
object.
|
||||||
|
|
||||||
When the compiler encounters a call to an object's constructor,
|
When the compiler encounters a call to an object's constructor,
|
||||||
it sets up the stack frame for the call, and inserts a call to the
|
it sets up the stack frame for the call, and inserts a call to the
|
||||||
\var{Help\_Constructor}
|
\var{Help\_Constructor}
|
||||||
procedure before issuing the call to the real constuctor.
|
procedure before issuing the call to the real constructor.
|
||||||
The helper procedure allocates the needed memory (if needed) and inserts the
|
The helper procedure allocates the needed memory (if needed) and inserts the
|
||||||
VMT pointer in the object. After that, the real constructor is called.
|
VMT pointer in the object. After that, the real constructor is called.
|
||||||
|
|
||||||
@ -1690,7 +1757,7 @@ set up the stack. Then it calls the main program.
|
|||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
% MMX Support
|
% MMX Support
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
\chapter{MMX support}
|
\chapter{MMX support (Intel x86 only) }
|
||||||
\label{ch:MMXSupport}
|
\label{ch:MMXSupport}
|
||||||
|
|
||||||
\section{What is it about ?}
|
\section{What is it about ?}
|
||||||
@ -1846,7 +1913,7 @@ procedure.
|
|||||||
\label{se:ThirtytwoBit}
|
\label{se:ThirtytwoBit}
|
||||||
The \fpc Pascal compiler issues 32-bit code. This has several consequences:
|
The \fpc Pascal compiler issues 32-bit code. This has several consequences:
|
||||||
\begin{itemize}
|
\begin{itemize}
|
||||||
\item You need a i386 or higher processor to run the generated code. The
|
\item You need a 386 processor to run the generated code. The
|
||||||
compiler functions on a 286 when you compile it using Turbo Pascal,
|
compiler functions on a 286 when you compile it using Turbo Pascal,
|
||||||
but the generated programs cannot be assembled or executed.
|
but the generated programs cannot be assembled or executed.
|
||||||
\item You don't need to bother with segment selectors. Memory can be
|
\item You don't need to bother with segment selectors. Memory can be
|
||||||
@ -1909,8 +1976,8 @@ pointer to \var{self} is pushed on the stack.
|
|||||||
\item If the procedure or function is nested in another function or
|
\item If the procedure or function is nested in another function or
|
||||||
procedure, then the frame pointer of the parent procedure is pushed on the
|
procedure, then the frame pointer of the parent procedure is pushed on the
|
||||||
stack.
|
stack.
|
||||||
\item The return address is pushed on the stack (by the \var{Call}
|
\item The return address is pushed on the stack (this is done automatically
|
||||||
instruction).
|
by the instruction which calls the subroutine).
|
||||||
\end{enumerate}
|
\end{enumerate}
|
||||||
|
|
||||||
The resulting stack frame upon entering looks as in \seet{StackFrame}.
|
The resulting stack frame upon entering looks as in \seet{StackFrame}.
|
||||||
@ -1924,19 +1991,71 @@ Offset & What is stored & Optional ? \\ \hline
|
|||||||
+0 & Return address & No\\ \hline
|
+0 & Return address & No\\ \hline
|
||||||
\end{FPCltable}
|
\end{FPCltable}
|
||||||
|
|
||||||
|
\subsection{ Intel x86 version }
|
||||||
|
|
||||||
The stack is cleared with the \var{ret} I386 instruction, meaning that the
|
The stack is cleared with the \var{ret} I386 instruction, meaning that the
|
||||||
size of all pushed parameters is limited to 64K.
|
size of all pushed parameters is limited to 64K.
|
||||||
|
|
||||||
The stack size is unlimited for all supported platforms. On the \var{GO32V2}
|
\subsubsection{ DOS }
|
||||||
platform, the minimum guaranteed stack is 128Kb, but this can be set with
|
|
||||||
the \var{-Ctxxx} compiler switch.
|
|
||||||
|
|
||||||
|
Under the DOS targets, the default stack size is set to 256Kb. This value
|
||||||
|
cannot be modified for the GO32V1 target. For the GO32V2 target, it can be
|
||||||
|
modified using a special DJGPP utility, \var{stubedit}.
|
||||||
|
Note that the stack size may also be changed with some compiler
|
||||||
|
switches. The requested size is used only if it is \emph{greater} than the
|
||||||
|
default stack size; otherwise the default stack size is used.
|
||||||
|
|
||||||
|
\subsubsection{ Linux }
|
||||||
|
|
||||||
|
Under Linux, stack size is limited only by the memory available on
|
||||||
|
the system.
|
||||||
|
|
||||||
|
\subsubsection{ OS/2 }
|
||||||
|
|
||||||
|
Under OS/2, stack size is determined by one of the runtime
|
||||||
|
environment variables set for EMX. Therefore, the stack size
|
||||||
|
is user defined.
|
||||||
|
|
||||||
|
\subsection{ Motorola 680x0 version }
|
||||||
|
|
||||||
|
Depending on the target processor, the stack is cleared in one of two
ways. If the target processor is an MC68020 or higher, the stack will
be cleared with a simple \var{rtd} instruction, meaning that the size
of all pushed parameters is limited to 32K.
|
||||||
|
|
||||||
|
Otherwise, on MC68000/68010 processors, the stack clearing mechanism
is slightly more complicated; the exit code will look like this:
|
||||||
|
|
||||||
|
\begin{verbatim}
{
 move.l  (sp)+,a0        ; pop the return address into a0
 add.l   #paramsize,sp   ; remove the pushed parameters from the stack
 move.l  a0,-(sp)        ; push the return address back
 rts                     ; return to the caller
}
\end{verbatim}
|
||||||
|
|
||||||
|
\subsubsection{ Amiga }
|
||||||
|
|
||||||
|
Under AmigaOS, the stack size is determined by the user, who sets this
value using the \var{stack} program. Typical sizes range from 4K to 40K.
|
||||||
|
|
||||||
|
\subsubsection{ Atari }
|
||||||
|
|
||||||
|
Under Atari TOS, stack size is currently limited to 8K, and it cannot
|
||||||
|
be modified. This may change in a future release of the compiler.
|
||||||
|
|
||||||
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
|
% The heap
|
||||||
|
\section{The heap}
|
||||||
|
\label{se:Heap}
|
||||||
The heap is used to store all dynamic variables, and to store class
|
The heap is used to store all dynamic variables, and to store class
|
||||||
instances. The interface to the heap is the same as in Turbo Pascal,
|
instances. The interface to the heap is the same as in Turbo Pascal,
|
||||||
although the effects are maybe not the same. On top of that, the \fpc
|
although the effects are maybe not the same. On top of that, the \fpc
|
||||||
run-time library has some extra possibilities, not available in Turbo
|
run-time library has some extra possibilities, not available in Turbo
|
||||||
Pascal. These extra possibilities are explained in the next subsections.
|
Pascal. These extra possibilities are explained in the next subsections.
|
||||||
|
|
||||||
|
|
||||||
% The heap grows
|
% The heap grows
|
||||||
\subsection{The heap grows}
|
\subsection{The heap grows}
|
||||||
\fpc supports the \var{HeapError} procedural variable. If this variable is
|
\fpc supports the \var{HeapError} procedural variable. If this variable is
|
||||||
@ -2064,7 +2183,7 @@ ReleaseTempHeap;
|
|||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
% Accessing DOS memory under the GO32 extender
|
% Accessing DOS memory under the GO32 extender
|
||||||
\section{Accessing \dos memory under the Go32 extender}
|
\section{Accessing \dos memory under the Go32 extender (Intel x86 only) }
|
||||||
\label{se:AccessingDosMemory}
|
\label{se:AccessingDosMemory}
|
||||||
|
|
||||||
Because \fpc is a 32 bit compiler, and uses a \dos extender, accessing DOS
|
Because \fpc is a 32 bit compiler, and uses a \dos extender, accessing DOS
|
||||||
@ -2118,6 +2237,317 @@ After using the selector, you must free it again using the
|
|||||||
More information on all this can be found in the \unitsref, the chapter on
|
More information on all this can be found in the \unitsref, the chapter on
|
||||||
the \file{GO32} unit.
|
the \file{GO32} unit.
|
||||||
|
|
||||||
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
|
% Optimizations done in the compiler
|
||||||
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
|
\chapter{Optimizations}
|
||||||
|
|
||||||
|
\section{ Non processor specific }
|
||||||
|
|
||||||
|
The following sections describe the general optimizations
done by the compiler; they are not processor specific. Some
of these require a compiler switch, while others are done
automatically (those which require a switch are noted as such).
|
||||||
|
|
||||||
|
\subsection{ Constant folding }
|
||||||
|
|
||||||
|
In \fpc, if the operand(s) of an operator are constants, they
|
||||||
|
will be evaluated at compile time.
|
||||||
|
|
||||||
|
Example:
\begin{verbatim}
  x:=1+2+3+6+5;
\end{verbatim}
will generate the same code as
\begin{verbatim}
  x:=17;
\end{verbatim}
|
||||||
|
|
||||||
|
Furthermore, if an array index is a constant, the offset will
be evaluated at compile time. This means that accessing \var{MyData[5]}
is as efficient as accessing a normal variable.
|
||||||
|
|
||||||
|
Finally, calling the \var{Chr}, \var{Hi}, \var{Lo}, \var{Ord}, \var{Pred},
or \var{Succ} functions with constant parameters generates no
run-time library calls. Instead, the values are evaluated at
compile time.
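
The following is a minimal sketch of the cases described above. The
program name, the constant \var{Base} and the variables are chosen for
illustration only. Every right-hand side marked in the comments consists
solely of constants, so one would expect it to be evaluated at compile time.
\begin{verbatim}
program FoldDemo;

const
  Base = 4;                 { illustrative constant }

var
  x : Longint;
  c : Char;

begin
  x := 1+2+3+6+5;           { same code as x := 17 }
  x := x + Base*2;          { Base*2 is a constant expression }
  c := Chr(Ord('A')+1);     { constant parameters, no run-time call }
  Writeln(x,' ',c);
end.
\end{verbatim}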
|
||||||
|
|
||||||
|
\subsection{ Constant merging }
|
||||||
|
|
||||||
|
Using the same constant string two or more times generates only
|
||||||
|
one copy of the string constant.
|
||||||
|
|
||||||
|
\subsection{ Short cut evaluation }
|
||||||
|
|
||||||
|
Evaluation of a boolean expression stops as soon as the result is
known, which makes code execute faster than if all boolean operands
were evaluated.
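
As an illustration, the sketch below relies on this behaviour to guard a
pointer dereference. The types \var{PNode} and \var{TNode} are hypothetical
and only serve the example; it assumes that short cut evaluation is in
effect as described above.
\begin{verbatim}
program ShortCut;

type
  PNode = ^TNode;
  TNode = record
    Value : Longint;
    Next  : PNode;
  end;

var
  P : PNode;

begin
  P := nil;
  { P^.Value is not evaluated here: P <> nil is already False, }
  { so the result of the 'and' is known.                       }
  if (P <> nil) and (P^.Value > 0) then
    Writeln('positive')
  else
    Writeln('empty or not positive');
end.
\end{verbatim}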
|
||||||
|
|
||||||
|
\subsection{ Constant set inlining }
|
||||||
|
|
||||||
|
Using the \var{in} operator is always more efficient than using the
equivalent \var{<>}, \var{=}, \var{<=}, \var{>=}, \var{<} and \var{>}
operators. This is because
range comparisons can be done more easily with \var{in} than with
normal comparison operators.
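
The two tests in the sketch below are equivalent; the second form uses the
\var{in} operator with a constant set. The program and variable names are
illustrative only.
\begin{verbatim}
program InDemo;

var
  c : Char;

begin
  c := 'k';
  { Range test written with comparison operators }
  if ((c >= 'a') and (c <= 'z')) or ((c >= '0') and (c <= '9')) then
    Writeln('alphanumeric (comparisons)');
  { The same test written with the in operator and a constant set }
  if c in ['a'..'z','0'..'9'] then
    Writeln('alphanumeric (in)');
end.
\end{verbatim}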
|
||||||
|
|
||||||
|
\subsection{ Small sets }
|
||||||
|
|
||||||
|
Sets which contain fewer than 33 elements can be directly encoded
using a 32-bit value, therefore no run-time library calls to
evaluate operations on these sets are required; they are directly encoded
by the code generator.
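
A set type of the kind meant here could look as follows. The type names
\var{TDay} and \var{TDays} are hypothetical; the base type has 7 elements,
so the set fits in a 32-bit value.
\begin{verbatim}
program SmallSet;

type
  TDay  = (Mon,Tue,Wed,Thu,Fri,Sat,Sun);   { 7 elements }
  TDays = set of TDay;

var
  Work, Rest : TDays;

begin
  Work := [Mon..Fri];
  Rest := [Sat,Sun];
  if (Work * Rest) = [] then          { intersection }
    Writeln('no overlap');
  if Wed in (Work + Rest) then        { union and membership }
    Writeln('Wed is a member');
end.
\end{verbatim}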
|
||||||
|
|
||||||
|
\subsection{ Range checking }
|
||||||
|
|
||||||
|
Assignments of constants to variables are range checked at compile
time, which removes the need to generate runtime range checking
code.
|
||||||
|
|
||||||
|
\emph{Remark:} This feature was not implemented before version
|
||||||
|
0.99.5 of \fpc.
|
||||||
|
|
||||||
|
\subsection{ Shifts instead of multiply or divide }
|
||||||
|
|
||||||
|
When one of the operands in a multiplication is a power of
two, the multiplication is encoded using arithmetic shift instructions,
which generates more efficient code.
|
||||||
|
|
||||||
|
Similarly, if the divisor in a \var{div} operation is a power
of two, it is encoded using arithmetic shift instructions.
|
||||||
|
|
||||||
|
The same is true when accessing arrays whose element size is a
power of two: the address is calculated using arithmetic
shifts instead of the multiply instruction.
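
The equivalence the compiler exploits can be seen in the following sketch,
which uses a non-negative value so that the shift and the division agree.
The names are illustrative only.
\begin{verbatim}
program ShiftDemo;

var
  x : Longint;

begin
  x := 40;
  Writeln(x * 8, ' ', x shl 3);     { both give 320 }
  Writeln(x div 4, ' ', x shr 2);   { both give 10  }
end.
\end{verbatim}
There is therefore no need to write the shifts by hand; the multiplication
and \var{div} forms can be kept for readability.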
|
||||||
|
|
||||||
|
\subsection{ Automatic alignment }
|
||||||
|
|
||||||
|
By default, all variables larger than a byte are guaranteed to be aligned
at least on a word boundary.
|
||||||
|
|
||||||
|
Furthermore all pointers allocated using the standard runtime
|
||||||
|
library (\var{New} and \var{GetMem} among others) are guaranteed
|
||||||
|
to return pointers aligned on a quadword boundary (64-bit alignment).
|
||||||
|
|
||||||
|
Alignment of variables on the stack depends on the target processor.
|
||||||
|
|
||||||
|
\emph{ Remark: } Quadword alignment of pointers is not guaranteed
|
||||||
|
on systems which don't use an internal heap, such as for the Win32
|
||||||
|
target.
|
||||||
|
|
||||||
|
\emph{ Remark: } Alignment is also done \emph{between} fields in
records, objects and classes. This is \emph{not} the same as
in Turbo Pascal and may cause problems when using disk I/O with these
types. To get no alignment between fields, use the \var{packed} directive
or the \var{\{\$PackRecords n\}} switch. For further information, take a
look at the reference manual under the \var{record} heading.
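
The effect of the \var{packed} directive can be checked with \var{SizeOf},
as in the sketch below. The record names are hypothetical. The size of the
unpacked record depends on the alignment in effect, so no value is given
for it here.
\begin{verbatim}
program AlignDemo;

type
  TPlain = record
    b : Byte;
    l : Longint;
  end;

  TPacked = packed record
    b : Byte;
    l : Longint;
  end;

begin
  Writeln('plain  : ',SizeOf(TPlain));    { depends on alignment }
  Writeln('packed : ',SizeOf(TPacked));   { 1 + 4 = 5 bytes }
end.
\end{verbatim}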
|
||||||
|
|
||||||
|
\subsection{ Smart linking }
|
||||||
|
|
||||||
|
This feature removes all unreferenced code in the final executable
|
||||||
|
file, making the executable file much smaller.
|
||||||
|
|
||||||
|
\emph{ Remark: } Smart linking was implemented starting with
|
||||||
|
version 0.99.6 of \fpc.
|
||||||
|
|
||||||
|
\subsection{ Inline routines }
|
||||||
|
|
||||||
|
The following runtime library routines are coded directly into the
|
||||||
|
final executable : \var{Lo}, \var{Hi}, \var{High}, \var{Sizeof},
|
||||||
|
\var{TypeOf}, \var{Length}, \var{Pred}, \var{Succ}, \var{Inc},
|
||||||
|
\var{Dec} and \var{Assigned}.
|
||||||
|
|
||||||
|
\emph{ Remark: } Inline \var{Inc} and \var{Dec} were not completely
|
||||||
|
implemented until version 0.99.6 of \fpc.
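
A short sketch using some of these routines is given below; the variable
names are illustrative. According to the list above, none of these calls
should result in a call to the run-time library.
\begin{verbatim}
program InlineDemo;

var
  w : Word;
  s : String;
  i : Longint;

begin
  w := $1234;
  s := 'Free Pascal';
  i := 10;
  Inc(i);                               { i = 11 }
  Dec(i,3);                             { i = 8 }
  Writeln(Hi(w),' ',Lo(w));             { 18 52 }
  Writeln(Length(s),' ',SizeOf(w));     { 11 2 }
  Writeln(Succ(i),' ',Pred(i));         { 9 7 }
end.
\end{verbatim}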
|
||||||
|
|
||||||
|
\subsection{ Case optimization }
|
||||||
|
|
||||||
|
When using the \var{-Oa} switch, some case statements will
be decoded using a jump table, which in certain cases will make the
case statement execute faster.
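
A case statement with dense, consecutive labels, such as the one in the
sketch below, is the kind of statement for which a jump table is a natural
choice. Whether the compiler actually generates one for this code is an
assumption; the decision is left to the compiler.
\begin{verbatim}
program CaseDemo;

var
  i : Longint;

begin
  for i := 0 to 5 do
    case i of           { dense, consecutive labels }
      0 : Writeln('zero');
      1 : Writeln('one');
      2 : Writeln('two');
      3 : Writeln('three');
      4 : Writeln('four');
    else
      Writeln('other');
    end;
end.
\end{verbatim}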
|
||||||
|
|
||||||
|
\subsection{ Stack frame omission }
|
||||||
|
|
||||||
|
When using the \var{-Ox} switch, under certain specific conditions,
the stack frame (entry and exit code for the routine) will be omitted, and
variables will be accessed directly via the stack pointer.
|
||||||
|
|
||||||
|
Conditions for omission of the stack frame :
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item Routine does not call other routines
|
||||||
|
\item Routine does not contain assembler statements
|
||||||
|
\item Routine is not declared using the \var{Interrupt} directive
|
||||||
|
\item Routine is not a constructor or destructor
|
||||||
|
\end{itemize}
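
The function \var{Average} in the sketch below (a hypothetical name) meets
all of the conditions listed above, so it is a candidate for stack frame
omission when compiled with \var{-Ox}. Whether the frame is actually
omitted is up to the compiler.
\begin{verbatim}
program LeafDemo;

{ Calls no other routine, contains no assembler statements, is not }
{ an interrupt routine and is not a constructor or destructor.     }
function Average(a, b : Longint) : Longint;
begin
  Average := (a + b) div 2;
end;

begin
  Writeln(Average(4,10));
end.
\end{verbatim}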
|
||||||
|
|
||||||
|
\subsection{ Register variables }
|
||||||
|
|
||||||
|
When using the \var{-Ox} switch, local variables or parameters
|
||||||
|
which are used very often will be moved to registers for faster
|
||||||
|
access.
|
||||||
|
|
||||||
|
\emph{ Remark: } Register variable allocation is currently
|
||||||
|
broken and should not be used.
|
||||||
|
|
||||||
|
\subsection{ Intel x86 specific }
|
||||||
|
|
||||||
|
Here follows a listing of the optimizing techniques used in the compiler:
|
||||||
|
\begin{enumerate}
|
||||||
|
\item When optimizing for a specific processor (\var{-O3, -O4, -O5, -O6}),
|
||||||
|
the following is done:
|
||||||
|
\begin{itemize}
|
||||||
|
\item In \var{case} statements, a check is done whether a jump table
|
||||||
|
or a sequence of conditional jumps should be used for optimal performance.
|
||||||
|
\item A number of strategies are determined for peephole optimization:
for example, \var{movzbl (\%ebp), \%eax}, as used on PentiumPro and PII
systems, will be changed into
\var{xorl \%eax,\%eax; movb (\%ebp),\%al } for lesser systems.
|
||||||
|
\end{itemize}
|
||||||
|
Cyrix \var{6x86} processor owners should optimize with \var{-O4} instead of
\var{-O5}, because \var{-O5} leads to larger code, and thus to slower
execution, according to the Cyrix developers FAQ.
|
||||||
|
\item When optimizing for speed (\var{-OG}) or size (\var{-Og}), a choice is
|
||||||
|
made between using shorter instructions (for size) such as \var{enter \$4},
|
||||||
|
or longer instructions \var{subl \$4,\%esp} for speed. When smaller size is
|
||||||
|
requested, things aren't aligned on 4-byte boundaries. When speed is
|
||||||
|
requested, things are aligned on 4-byte boundaries as much as possible.
|
||||||
|
\item Simple optimization (\var{-Oa}) makes sure the peephole optimizer is
|
||||||
|
used, as well as the reloading optimizer.
|
||||||
|
\item Uncertain optimizations (\var{-Oz}): With this switch, the reloading
|
||||||
|
optimizer (enabled with \var{-Oa}) can be forced into making uncertain
|
||||||
|
optimizations.
|
||||||
|
|
||||||
|
You can enable uncertain optimizations only in certain cases,
|
||||||
|
otherwise you will produce a bug; the following technical description
|
||||||
|
tells you when to use them:
|
||||||
|
\begin{quote}
|
||||||
|
% Jonas's own words..
|
||||||
|
\em
|
||||||
|
If uncertain optimizations are enabled, the reloading optimizer assumes
|
||||||
|
that
|
||||||
|
\begin{itemize}
|
||||||
|
\item If something is written to a local/global register or a
|
||||||
|
procedure/function parameter, this value doesn't overwrite the value to
|
||||||
|
which a pointer points.
|
||||||
|
\item If something is written to memory pointed to by a pointer variable,
|
||||||
|
this value doesn't overwrite the value of a local/global variable or a
|
||||||
|
procedure/function parameter.
|
||||||
|
\end{itemize}
|
||||||
|
% end of quote
|
||||||
|
\end{quote}
|
||||||
|
The practical upshot of this is that you cannot use the uncertain
|
||||||
|
optimizations if you access any local or global variables through pointers. In
|
||||||
|
theory, this includes \var{Var} parameters, but it is all right
|
||||||
|
if you don't both read the variable once through its \var{Var} reference
|
||||||
|
and then read it using its name.
|
||||||
|
|
||||||
|
The following example will produce bad code when you switch on
|
||||||
|
uncertain optimizations:
|
||||||
|
\begin{verbatim}
|
||||||
|
Var temp: Longint;
|
||||||
|
|
||||||
|
Procedure Foo(Var Bar: Longint);
|
||||||
|
Begin
|
||||||
|
If (Bar = temp)
|
||||||
|
Then
|
||||||
|
Begin
|
||||||
|
Inc(Bar);
|
||||||
|
If (Bar <> temp) then Writeln('bug!')
|
||||||
|
End
|
||||||
|
End;
|
||||||
|
|
||||||
|
Begin
|
||||||
|
Foo(Temp);
|
||||||
|
End.
|
||||||
|
\end{verbatim}
|
||||||
|
The reason it produces bad code is because you access the global variable
|
||||||
|
\var{Temp} both through its name \var{Temp} and through a pointer, in this
|
||||||
|
case using the \var{Bar} variable parameter, which is nothing but a pointer
|
||||||
|
to \var{Temp} in the above code.
|
||||||
|
|
||||||
|
On the other hand, you can use the uncertain optimizations if
|
||||||
|
you access global/local variables or parameters through pointers,
|
||||||
|
and {\em only} access them through this pointer\footnote{
|
||||||
|
You can use multiple pointers to point to the same variable as well, that
|
||||||
|
doesn't matter.}.
|
||||||
|
|
||||||
|
For example:
|
||||||
|
\begin{verbatim}
|
||||||
|
Type TMyRec = Record
|
||||||
|
a, b: Longint;
|
||||||
|
End;
|
||||||
|
PMyRec = ^TMyRec;
|
||||||
|
|
||||||
|
|
||||||
|
TMyRecArray = Array [1..100000] of TMyRec;
|
||||||
|
PMyRecArray = ^TMyRecArray;
|
||||||
|
|
||||||
|
Var MyRecArrayPtr: PMyRecArray;
|
||||||
|
MyRecPtr: PMyRec;
|
||||||
|
Counter: Longint;
|
||||||
|
|
||||||
|
Begin
|
||||||
|
New(MyRecArrayPtr);
|
||||||
|
For Counter := 1 to 100000 Do
|
||||||
|
Begin
|
||||||
|
MyRecPtr := @MyRecArrayPtr^[Counter];
|
||||||
|
MyRecPtr^.a := Counter;
|
||||||
|
MyRecPtr^.b := Counter div 2;
|
||||||
|
End;
|
||||||
|
End.
|
||||||
|
\end{verbatim}
|
||||||
|
Will produce correct code, because the global variable \var{MyRecArrayPtr}
|
||||||
|
is not accessed directly, but through a pointer (\var{MyRecPtr} in this
|
||||||
|
case).
|
||||||
|
|
||||||
|
In conclusion, one could say that you can use uncertain optimizations {\em
|
||||||
|
only} when you know what you're doing.
|
||||||
|
\end{enumerate}
|
||||||
|
|
||||||
|
\subsection{ Motorola 680x0 specific }
|
||||||
|
|
||||||
|
Using the \var{-O2} switch enables several optimizations in the
code produced, the most notable being:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item Sign extension from byte to long will use \var{EXTB}
|
||||||
|
\item Returning of functions will use \var{RTD}
|
||||||
|
\item Range checking will generate no run-time calls
|
||||||
|
\item Multiplication will use the long \var{MULS} instruction, no
|
||||||
|
runtime library call will be generated
|
||||||
|
\item Division will use the long \var{DIVS} instruction, no
|
||||||
|
runtime library call will be generated
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
|
||||||
|
\section{ Floating point }
|
||||||
|
|
||||||
|
This section contains processor specific information on the floating
point code generated by the compiler.
|
||||||
|
|
||||||
|
\subsection{ Intel x86 specific }
|
||||||
|
|
||||||
|
All normal floating point types map to their real type, including
|
||||||
|
\var{comp} and \var{extended}.
|
||||||
|
|
||||||
|
\subsection{ Motorola 680x0 specific }
|
||||||
|
|
||||||
|
Early generations of the Motorola 680x0 processors did not have integrated
floating point units. To circumvent this, all floating point
operations are emulated (when the \var{\$E+} switch is on, which is the
default) using the IEEE \var{Single} floating point type. In other words, when
emulation is on, \var{Real}, \var{Single}, \var{Double} and \var{Extended}
all map to the \var{Single} floating point type.
|
||||||
|
|
||||||
|
When the \var{\$E} switch is turned off, normal 68882/68881/68040
|
||||||
|
floating point opcodes are emitted. The Real type still maps to
|
||||||
|
\var{Single} but the other types map to their true floating point
|
||||||
|
types. Only basic FPU opcodes are used, which means that the generated
code can work correctly on 68040 processors.
|
||||||
|
|
||||||
|
\emph{ Remark: } \var{Double} and \var{Extended} types in true floating
|
||||||
|
point mode have not been extensively tested as of version 0.99.5.
|
||||||
|
|
||||||
|
\emph{ Remark: } The \var{comp} data type is currently not supported.
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
% Appendices
|
% Appendices
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
@ -2281,130 +2711,6 @@ changed by changing the \var{bytearray1} type in \file{cobjects.pas}
|
|||||||
compiler. When using the 32-bit compiler, the limit is set to 1024. You can
|
compiler. When using the 32-bit compiler, the limit is set to 1024. You can
|
||||||
change this by redefining the \var{maxunits} constant in the
|
change this by redefining the \var{maxunits} constant in the
|
||||||
\file{files.pas} compiler source file.
|
\file{files.pas} compiler source file.
|
||||||
\item Procedures or functions accept parameters with a total size up to
|
|
||||||
\var{\$ffff} bytes. This limit is due to the \var{RET} instruction of the I386
|
|
||||||
processor. If the calls were made using the C convention this limit would
|
|
||||||
disappear.
|
|
||||||
\end{enumerate}
|
\end{enumerate}
|
||||||
|
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
||||||
% Appendix D
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
||||||
|
|
||||||
\chapter{Optimizing techniques used in the compiler.}
|
|
||||||
Here follows a listing of the optimizing techniques used in the compiler:
|
|
||||||
\begin{enumerate}
|
|
||||||
\item When optimizing for a specific processor (\var{-O3, -O4, -O5, -O6}),
|
|
||||||
the following is done:
|
|
||||||
\begin{itemize}
|
|
||||||
\item In \var{case} statements, a check is done whether a jump table
|
|
||||||
or a sequence of conditional jumps should be used for optimal performance.
|
|
||||||
\item Determines a number of strategies when doing peephole optimization:
|
|
||||||
\var{movzbl (\%ebp), \%eax} on PentiumPro and PII systems will be changed
|
|
||||||
into \var{xorl \%eax,\%eax; movb (\%ebp),\%al } for lesser systems.
|
|
||||||
\end{itemize}
|
|
||||||
Cyrix \var{6x86} processor owners should optimize with \var{-O4} instead of
|
|
||||||
\var{-O5}, because \var{-O5} leads to larger code, and thus to smaller
|
|
||||||
speed, according to the Cyrix developers FAQ.
|
|
||||||
\item When optimizing for speed (\var{-OG}) or size (\var{-Og}), a choice is
|
|
||||||
made between using shorter instructions (for size) such as \var{enter \$4},
|
|
||||||
or longer instructions \var{subl \$4,\%esp} for speed. When smaller size is
|
|
||||||
requested, things aren't aligned on 4-byte boundaries. When speed is
|
|
||||||
requested, things are aligned on 4-byte boundaries as much as possible.
|
|
||||||
\item Simple optimization (\var{-Oa}) makes sure the peephole optimizer is
|
|
||||||
used, as well as the reloading optimizer.
|
|
||||||
\item Maximum optimization (\var{-Ox}) avoids creation of stack frames if
|
|
||||||
they aren't required, and unnecessary loading of registers is avoided as
|
|
||||||
much as possible (buggy at the moment, version 0.99.0).
|
|
||||||
\item Uncertain optimizations (\var{-Oz}): With this switch, the reloading
|
|
||||||
optimizer (enabled with \var{-Oa}) can be forced into making uncertain
|
|
||||||
optimizations.
|
|
||||||
|
|
||||||
You can enable uncertain optimizations only in certain cases,
|
|
||||||
otherwise you will produce a bug; the following technical description
|
|
||||||
tells you when to use them:
|
|
||||||
\begin{quote}
|
|
||||||
% Jonas's own words..
|
|
||||||
\em
|
|
||||||
If uncertain optimizations are enabled, the reloading optimizer assumes
|
|
||||||
that
|
|
||||||
\begin{itemize}
|
|
||||||
\item If something is written to a local/global register or a
|
|
||||||
procedure/function parameter, this value doesn't overwrite the value to
|
|
||||||
which a pointer points.
|
|
||||||
\item If something is written to memory pointed to by a pointer variable,
|
|
||||||
this value doesn't overwrite the value of a local/global variable or a
|
|
||||||
procedure/function parameter.
|
|
||||||
\end{itemize}
|
|
||||||
% end of quote
|
|
||||||
\end{quote}
|
|
||||||
The practical upshot of this is that you cannot use the uncertain
|
|
||||||
optimizations if you access any local or global variables through pointers. In
|
|
||||||
theory, this includes \var{Var} parameters, but it is all right
|
|
||||||
if you don't both read the variable once through its \var{Var} reference
|
|
||||||
and then read it using its name.
|
|
||||||
|
|
||||||
The following example will produce bad code when you switch on
|
|
||||||
uncertain optimizations:
|
|
||||||
\begin{verbatim}
|
|
||||||
Var temp: Longint;
|
|
||||||
|
|
||||||
Procedure Foo(Var Bar: Longint);
|
|
||||||
Begin
|
|
||||||
If (Bar = temp)
|
|
||||||
Then
|
|
||||||
Begin
|
|
||||||
Inc(Bar);
|
|
||||||
If (Bar <> temp) then Writeln('bug!')
|
|
||||||
End
|
|
||||||
End;
|
|
||||||
|
|
||||||
Begin
|
|
||||||
Foo(Temp);
|
|
||||||
End.
|
|
||||||
\end{verbatim}
|
|
||||||
The reason it produces bad code is because you access the global variable
|
|
||||||
\var{Temp} both through its name \var{Temp} and through a pointer, in this
|
|
||||||
case using the \var{Bar} variable parameter, which is nothing but a pointer
|
|
||||||
to \var{Temp} in the above code.
|
|
||||||
|
|
||||||
On the other hand, you can use the uncertain optimizations if
|
|
||||||
you access global/local variables or parameters through pointers,
|
|
||||||
and {\em only} access them through this pointer\footnote{
|
|
||||||
You can use multiple pointers to point to the same variable as well, that
|
|
||||||
doesn't matter.}.
|
|
||||||
|
|
||||||
For example:
|
|
||||||
\begin{verbatim}
|
|
||||||
Type TMyRec = Record
|
|
||||||
a, b: Longint;
|
|
||||||
End;
|
|
||||||
PMyRec = ^TMyRec;
|
|
||||||
|
|
||||||
|
|
||||||
TMyRecArray = Array [1..100000] of TMyRec;
|
|
||||||
PMyRecArray = ^TMyRecArray;
|
|
||||||
|
|
||||||
Var MyRecArrayPtr: PMyRecArray;
|
|
||||||
MyRecPtr: PMyRec;
|
|
||||||
Counter: Longint;
|
|
||||||
|
|
||||||
Begin
|
|
||||||
New(MyRecArrayPtr);
|
|
||||||
For Counter := 1 to 100000 Do
|
|
||||||
Begin
|
|
||||||
MyRecPtr := @MyRecArrayPtr^[Counter];
|
|
||||||
MyRecPtr^.a := Counter;
|
|
||||||
MyRecPtr^.b := Counter div 2;
|
|
||||||
End;
|
|
||||||
End.
|
|
||||||
\end{verbatim}
|
|
||||||
Will produce correct code, because the global variable \var{MyRecArrayPtr}
|
|
||||||
is not accessed directly, but through a pointer (\var{MyRecPtr} in this
|
|
||||||
case).
|
|
||||||
|
|
||||||
In conclusion, one could say that you can use uncertain optimizations {\em
|
|
||||||
only} when you know what you're doing.
|
|
||||||
\end{enumerate}
|
|
||||||
\end{document}
|
\end{document}
|
||||||
|
62
docs/ref.tex
62
docs/ref.tex
@ -114,8 +114,8 @@ percent sign (\var{\%}). Thus, \var{255} can be specified in binary notation
|
|||||||
as \var{\%11111111}.
|
as \var{\%11111111}.
|
||||||
|
|
||||||
\subsection{Real types}
|
\subsection{Real types}
|
||||||
\fpc uses the math coprocessor (or an emulation) for al its floating-point
|
\fpc uses the math coprocessor (or an emulation) for all its floating-point
|
||||||
calculations. The Real native type for is processor dependant,
|
calculations. The Real native type is processor dependent,
|
||||||
but it is either Single or Double. Only the IEEE floating point types are
|
but it is either Single or Double. Only the IEEE floating point types are
|
||||||
supported, and these depend on the target processor and emulation options.
|
supported, and these depend on the target processor and emulation options.
|
||||||
The true Turbo Pascal compatible types are listed in
|
The true Turbo Pascal compatible types are listed in
|
||||||
@ -812,21 +812,6 @@ command-line switch.
|
|||||||
{\em Remark:} These constructions are just for typing convenience, they
|
{\em Remark:} These constructions are just for typing convenience, they
|
||||||
don't generate different code.
|
don't generate different code.
|
||||||
|
|
||||||
\fpc also supports typed assignments. This means that an assignment
|
|
||||||
statement has a definite type, and hence can be assigned to another
|
|
||||||
variable. The type of the assignment \var{a:=b} is the type of \var{a}
|
|
||||||
(or, in this case, of \var{b}), and this can be assigned to another
|
|
||||||
variable : \var{c:=a:=b;}.
|
|
||||||
To summarize: the construct
|
|
||||||
\begin{verbatim}
|
|
||||||
a:=b:=c;
|
|
||||||
\end{verbatim}
|
|
||||||
results in both \var{a} and \var{b} being assign the value of \var{c}, which
|
|
||||||
may be an expression.
|
|
||||||
|
|
||||||
For this construct to be allowed, it is necessary to specify the \var{-Sa4}
|
|
||||||
switch on the command line.
|
|
||||||
|
|
||||||
\subsection{The \var{Case} statement}
|
\subsection{The \var{Case} statement}
|
||||||
\fpc supports the \var{case} statement. Its prototype is
|
\fpc supports the \var{case} statement. Its prototype is
|
||||||
\begin{verbatim}
|
\begin{verbatim}
|
||||||
@ -968,7 +953,11 @@ Be aware of the fact that the boolean expressions \var{Expression1} and
|
|||||||
will be stopped at the point where the outcome is known with certainty)
|
will be stopped at the point where the outcome is known with certainty)
|
||||||
|
|
||||||
\subsection{The \var{With} statement}
|
\subsection{The \var{With} statement}
|
||||||
The with statement serves to access the elements of a record, without
|
|
||||||
|
The \var{with} statement serves to access the elements of a record\footnote{
The \var{with} statement does not work correctly when used with
objects or classes until version 0.99.6.}, without
|
||||||
having to specify the name of the record. Given the declaration:
|
having to specify the name of the record. Given the declaration:
|
||||||
\begin{verbatim}
|
\begin{verbatim}
|
||||||
Type Passenger = Record
|
Type Passenger = Record
|
||||||
@ -1063,10 +1052,12 @@ are also declared with open arrays as parameters, {\em not} to functions or
|
|||||||
procedures which accept arrays of fixed length.
|
procedures which accept arrays of fixed length.
|
||||||
|
|
||||||
\section{Using assembler in your code}
|
\section{Using assembler in your code}
|
||||||
|
|
||||||
\fpc supports the use of assembler in your code, but not inline
|
\fpc supports the use of assembler in your code, but not inline
|
||||||
assembler macros. Assembly functions (i.e. functions declared with the
|
assembler macros. For more information on the processor
specific assembler syntax and its limitations, see the \progref.
|
||||||
\progref for more information about this).
|
|
||||||
|
\subsection{ Assembler statements }
|
||||||
|
|
||||||
The following is an example of assembler inclusion in your code.
|
The following is an example of assembler inclusion in your code.
|
||||||
\begin{verbatim}
|
\begin{verbatim}
|
||||||
@ -1090,6 +1081,34 @@ recognise it, and treat it as any other conditionals.
|
|||||||
\emph{ Remark: } Before version 0.99.1, \fpc did not support
|
\emph{ Remark: } Before version 0.99.1, \fpc did not support
|
||||||
reference to variables by their names in the assembler parts of your code.
|
reference to variables by their names in the assembler parts of your code.
|
||||||
|
|
||||||
|
\subsection{ Assembler procedures and functions }
|
||||||
|
|
||||||
|
Assembler procedures and functions are declared using the
|
||||||
|
\var{Assembler} directive. The \var{Assembler} keyword is supported
|
||||||
|
as of version 0.9.7. This permits the code generator to make a number
|
||||||
|
of code generation optimizations.
|
||||||
|
|
||||||
|
The code generator does not generate any stack frame (entry and exit
|
||||||
|
code for the routine) if it contains no local variables. In the case
|
||||||
|
of functions, ordinal values must be returned in the accumulator. In
|
||||||
|
the case of floating point values, these depend on the target processor
|
||||||
|
and emulation options.
|
||||||
|
|
||||||
|
\emph{ Remark: } Before version 0.99.1, \fpc did not support
|
||||||
|
reference to variables by their names in the assembler parts of your code.
|
||||||
|
|
||||||
|
\emph{ Remark: } Currently, the \var{Assembler} directive does not have the
same effect as in Turbo Pascal, so beware! In \fpc, parameters are
treated normally, which is not the case in Turbo Pascal. Furthermore,
the stack frame will be omitted if there are no local variables; in this
case, if the assembly routine has any parameters, they will be referenced
directly via the stack pointer. This is \emph{not} like Turbo Pascal, where
the stack frame is only omitted if there are no parameters \emph{and} no
local variables. Therefore, if your assembly routine modifies the stack
pointer, such as when pushing or popping values on the stack, the
\var{Assembler} keyword should not be used. Instead, use a normal procedure
with \var{Asm} blocks.
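
As a minimal sketch, an assembler function that returns an ordinal value
in the accumulator could look as follows. The example assumes the Intel
x86 target and AT\&T assembler syntax; the function name \var{FortyTwo} is
hypothetical. The function has no parameters and no local variables, and it
leaves the stack pointer untouched.
\begin{verbatim}
program AsmDemo;

{ The result is returned in the accumulator (EAX on Intel x86). }
function FortyTwo : Longint; assembler;
asm
  movl $42,%eax
end;

begin
  Writeln(FortyTwo);
end.
\end{verbatim}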
|
||||||
|
|
||||||
\section{Modifiers}
|
\section{Modifiers}
|
||||||
\fpc doesn't support all Turbo Pascal modifiers, but
|
\fpc doesn't support all Turbo Pascal modifiers, but
|
||||||
does support a number of additional modifiers. They are used mainly for assembler and
|
does support a number of additional modifiers. They are used mainly for assembler and
|
||||||
@ -1207,7 +1226,6 @@ function must be exactly the same.
|
|||||||
The \var{external} modifier has also an extended syntax:
|
The \var{external} modifier has also an extended syntax:
|
||||||
\begin{enumerate}
|
\begin{enumerate}
|
||||||
\item
|
\item
|
||||||
|
|
||||||
\begin{verbatim}
|
\begin{verbatim}
|
||||||
external 'lname';
|
external 'lname';
|
||||||
\end{verbatim}
|
\end{verbatim}
|
||||||
|