* first usable version

This commit is contained in:
florian 1998-09-06 17:09:47 +00:00
parent 58027e2792
commit c4afb6c8db

View File

@ -16,7 +16,7 @@
% You should have received a copy of the GNU Library General Public
% License along with the FPC documentation; see the file COPYING.LIB. If not,
% write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330,
% Boston, MA 02111-1307, USA.
% Boston, MA 02111-1307, USA.
%
\documentclass{report}
\usepackage{a4}
@ -24,20 +24,20 @@
\makeindex
\latex{\usepackage{multicol}}
\latex{\usepackage{fpcman}}
\latex{\usepackage{epsfig}}
\html{\input{fpc-html.tex}}
\newcommand{\remark}[1]{\par$\rightarrow$\textbf{#1}\par}
\newcommand{\olabel}[1]{\label{option:#1}}
% We should change this to something better. See \seef etc.
\newcommand{\seeo}[1]{See \ref{option:#1}}
\begin{document}
\title{Inside Free Pascal}
\docdescription{Internal documentation for \fpc, version \fpcversion}
\docversion{1.2}
\date{March 1998}
\author{Florian Kl\"ampfl}
\title{Free Pascal :\\ Compiler documentation}
\docdescription{Compiler documentation for \fpc, version \fpcversion}
\docversion{1.0}
\date{September 1998}
\author{Micha\"el Van Canneyt\\Florian Kl\"ampfl}
\maketitle
\tableofcontents
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Introduction
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@ -61,6 +61,19 @@ The \file{README} files are, in case of conflict with this manual,
I hope, my poor english is quite understandable. Feel free to correct
spelling mistakes.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% About the compiler
\section{About the compiler}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Getting more information.
\section{Getting more information.}
The ultimative source for informations about compiler internals is
the compiler source though it isn't very well documented. If you
need more infomrations you should join the developers mailing
list or you can contact the developers.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Overview
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@ -69,36 +82,477 @@ spelling mistakes.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The scanner
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{The scanner}
%% \chapter{The scanner}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The symbol tables
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{The symbol tables}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Symbols
\section{Symbols}
The symbol table is used to store informations about all
symbols, declarations and definitions in a program.
In an abtract view, a symbol table is a data base with a string field
as index. \fpc implements the symbol table mainly as a binary tree,
for big symbol tables some hash technics are used. The implementation
can be found in symtable.pas, object tsymtable.
The symbol table module can't be associated with a stage of the compiler,
each stage does accesses to the symbol table.
The scanner uses a symbol table to handle preprocessor symbols, the
parser inserts declaration and the code generator uses the collected
informations about symbols and types to generate the code.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Definitions
\section{Definitions}
Definitions are one of the importantest data structures in \fpc.
They are used to describe types, for example the type of a variable
symbol is given by a definition and the result type
of a expression is given as a definition.
They have nothing to do with the definition of a procedure.
Definitions are implemented as a object (symtable.pas, tdef and
it's decendants). There are a lot of different
definitions: for example to describe
ordinal type, arrays, pointers, procedures
To make it more clear let's have a look to the fields of tdef:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Symbols
%% \section{Symbols}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Working with symbol tables
\section{Working with symbol tables}
%% \section{Working with symbol tables}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The parse tree
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{The parse tree}
%% \chapter{The parse tree}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The parser
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{The parser}
%% \chapter{The parser}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The semantical analysis
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% \chapter{The semantical analysis}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The code generation
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{The code generation}
%% \chapter{The code generation}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The assembler writers
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{The assembler writers}
\fpc doesn't generate machine language, it generates
assembler which must be assembled and linked.
The assembler output is configurable, \fpc can create
assembler for the GNU AS, the NASM (Netwide assembler) and
the assemblers of Borland and Microsoft. The default assembler
is the GNU AS, because it is fast and and available on
many platforms. Why don't we use the NASM? It is 2-4 times
slower than the GNU AS and it is create for
man kind written assembler, while the GNU AS is designed
as back end for a compiler.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Miscalleanous
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% \chapter{Miscalleanous}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The register allocation
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{The register allocation}
The register allocation is very hairy, so it gets
an own chapter in that manual. Please be careful when changing things
regarding the register allocation and test such changes intensive.
Future versions will may be implement another kind of register allocation
to make this part of the compiler more robust, see
\ref{se:future_plans}. But the current
system is less or more working and changing it would be a lot of
work, so we have to live with it.
The current register allocation mechanism was implement 5 years
ago and I didn't think, that the compiler becomes
so popular, so not much time was spend in the design
of the register allocation.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Basics
\section{Basics}
The register allocation is done in the first and second pass of
the compiler.
The first pass of a node has to calculate how much registers
are necessary to generate code for the node, it have
also to take care of child nodes i.e. how much registers
they need.
The register allocation is done via \var{getregister\*}
(where * is \var{32} or \var{mmx}).
Registers can be released via \var{ungetregister\*}. All registers
of a reference (i.e.base and index) can be released by
\var{del\_reference}. These procedures take care of the register type,
i.e. stack/base registers and registers allocated by register
variables aren't added to the set of unused registers.
If there is a problem in the register allocation an \var{internalerror(10)}
occurs.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% A simple example
\section{A simple example}
\subsection{The first pass}
This is a part of the first pass for a pointer dereferencation
(\var{p\^}), the type determination and some other stuff are left out
\begin{verbatim}
procedure firstderef(var p : ptree);
begin
// .....
// first pass of the child node
firstpass(p^.left);
// .....
// to dereference a pointer we need one one register
// but if the child node needs more registers, we
// have to pass this to our parent node
p^.registers32:=max(p^.left^.registers32,1);
// a pointer dereferencation doesn't need
// fpu or mmx registers
p^.registersfpu:=p^.left^.registersfpu;
p^.registersmmx:=p^.left^.registersmmx;
// .....
end;
\end{verbatim}
\subsection{The second pass}
The following code contains the complete second pass for
a pointer dereferencing node as it is used by current
compiler versions:
\begin{verbatim}
procedure secondderef(var p : ptree);
var
hr : tregister;
begin
// second pass of the child node, this generates also
// the code of the child node
secondpass(p^.left);
// setup the reference (this sets all values to nil, zero or
// R_NO)
clear_reference(p^.location.reference);
// now we have to distinguish the different locations where
// the child node could be stored
case p^.left^.location.loc of
LOC_REGISTER:
// LOC_REGISTER allows us to use simply the
// result register of the left node
p^.location.reference.base:=p^.left^.location.register;
LOC_CREGISTER:
begin
// we shouldn't destroy the result register of the
// result node, because it is a register variable
// so we allocate a register
hr:=getregister32;
// generate the loading instruction
emit_reg_reg(A_MOV,S_L,p^.left^.location.register,hr);
// setup the result location of the current node
p^.location.reference.base:=hr;
end;
LOC_MEM,LOC_REFERENCE:
begin
// first, we have to release the registers of
// the reference, before we can allocate
// register, del_reference release only the
// registers used by the reference,
// the contents of the registers isn't destroyed
del_reference(p^.left^.location.reference);
// now should be at least one register free, so we
// can allocate one for the base of the result
hr:=getregister32;
// generate dereferencing instruction
exprasmlist^.concat(new(pai386,op_ref_reg(
A_MOV,S_L,newreference(p^.left^.location.reference),
hr)));
// setup the location of the new created reference
p^.location.reference.base:=hr;
end;
end;
end;
\end{verbatim}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Binary nodes
\section{Binary nodes}
The whole thing becomes a little bit more hairy, if you have to
generate code for a binary+ node (a node with two or more
childs). If a node calls second pass for a child node,
it has to ensure that enough registers are free
to evalute the child node (\var{usableregs>=childnode\^.registers32}).
If this condition isn't true, the current node have
to store and restore all registers which the node does own to
release registers. This should be done using the
procedures \var{maybe\_push} and \var{restore}. If still
\var{usableregs<childnode\^.registers32}, the child nodes have to solve
the problem. The point is: if \var{usableregs<childnode\^.registers32},
the current node have to release all registers which it owns
before the second pass is called. An example for generating
code of a binary node is \var{cg386add.secondadd}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% FPU registers
\section{FPU registers}
The number of required FPU registers must be also calculated with
one difference: you needn't to save registers, if too few registers
are free, just an error message is generated, the user
have to take care of too few FPU registers, this is a consequence
of the stack structure of the FPU.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Testing register allocation
\section{Testing register allocation}
To test new stuff, you should compile a procedure which contains some local
longint variables with \file{-Or}, to limit the number of
registers:
\begin{verbatim}
procedure test;
var
l,i,j,k : longint;
begin
l:=i; // this forces the compiler to assign as much as
j:=k; // possible variables to registers
// here you should insert your code
end;
\end{verbatim}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Future plans
\section{Future plans}
\label{se:future_plans}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Coding style guide
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Coding style guide}
This chapter describes what you should consider if you modify the
compiler sources.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The formatting of the source
\section{The formatting of the sources}
Rules how to format the sources.
\begin{itemize}
\item All compiler files should be saved in UNIX format i.e. only
a line feed (\#10), no carrige return (\#13).
\item Don't use tabs
\end{itemize}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Some hints how to write the code
\section{Some hints how to write the code}
\begin{itemize}
\item Assigned should be used instead of checking for nil directly, as
it can help solving pointer problems when in real mode.
\end{itemize}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Compiler Defines
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Compiler Defines}
The compiler can be configured using command line defines, the
basic set is decribed here, switches which change rapidly or
which are only used temporarly are described in the header
of \file{PP.PAS}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Target Processor
\section{Target processor}
The target processor must be set always and it can be:
\begin{description}
\item [\var{I386}] for Intel 32 bit processors of the i386 class
\item [\var{M68K}] for Motorola processors of the 68000 class
\end{description}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Include compiler Parts
\section{Include compiler Parts}
\subsection{General}
\begin{description}
\item[\var{GDB}] include GDB stab debugging (\file{-g}) support
\item[\var{UseBrowser}] include Browser (\file{-b}) support
\end{description}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Leave Out specific Parts
\section{Leave Out specific Parts}
Leaving of parts of the compiler is useful, if you want to create
a compiler which should also run on systems with less memory
requirements (for example a real mode version compiled with Turbo Pascal).
\subsection{General}
\begin{description}
\item[\var{NoOpt}] will leave out the optimizer
\end{description}
\subsection{I386 specific}
The following defines apply only to the i386 version of the compiler.
\begin{description}
\item[\var{NoAg386Int}] No Intel styled assembler (for the MASM/TASM) writer
\item[\var{NoAg386Nsm}] No NASM assembler writer
\item[\var{NoAg386Att}] No AT\&T assembler (for the GNU AS) writer
\item[\var{NoRA386Int}] No Intel assembler parser
\item[\var{NoRA386Dir}] No direct assembler parser
\item[\var{NoRA386Att}] No AT\&T assembler parser
\end{description}
\subsection{M68k specific}
The following defines apply only to the M68k version of the compiler.
\begin{description}
\item[\var{NoAg68kGas}] No gas asm writer
\item[\var{NoAg68kMit}] No mit asm writer
\item[\var{NoAg68kMot}] No mot asm writer
\item[\var{NoRA68kMot}] No Motorola assembler parser
\end{description}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Location of the code generator functions
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Location of the code generator functions}
This appendix describes where to find the functions of
the code generator. The file names are given for the
i386, for the m68k rename the 386 to 68k
\begin{description}
\item[\file{cg386con}] Constant generation
\begin{description}
\item[\var{secondordconst}]
\item[\var{secondrealconst}]
\item[\var{secondstringconst}]
\item[\var{secondfixconst}]
\item[\var{secondsetconst}]
\item[\var{secondniln}]
\end{description}
\item[\file{cg386mat}] Mathematic functions
\begin{description}
\item[\var{secondmoddiv}]
\item[\var{secondshlshr}]
\item[\var{secondumminus}]
\item[\var{secondnot}]
\end{description}
\item[\file{cg386cnv}] Type conversion functions
\begin{description}
\item[\var{secondtypeconv}]
\item[\var{secondis}]
\item[\var{secondas}]
\end{description}
\item[\file{cg386add}] Add/concat functions
\begin{description}
\item[\var{secondadd}]
\end{description}
\item[\file{cg386mem}] Memory functions
\begin{description}
\item[\var{secondvecn}]
\item[\var{secondaddr}]
\item[\var{seconddoubleaddr}]
\item[\var{secondsimplenewdispose}]
\item[\var{secondhnewn}]
\item[\var{secondhdisposen}]
\item[\var{secondselfn}]
\item[\var{secondwith}]
\item[\var{secondloadvmt}]
\item[\var{secondsubscriptn}]
\item[\var{secondderef}]
\end{description}
\item[\file{cg386flw}] Flow functions
\begin{description}
\item[\var{secondifn}]
\item[\var{second\_while\_repeatn}]
\item[\var{secondfor}]
\item[\var{secondcontinuen}]
\item[\var{secondbreakn}]
\item[\var{secondexitn}]
\item[\var{secondlabel}]
\item[\var{secondgoto}]
\item[\var{secondtryfinally}]
\item[\var{secondtryexcept}]
\item[\var{secondraise}]
\item[\var{secondfail}]
\end{description}
\item[\file{cg386ld}] Load/Store functions
\begin{description}
\item[\var{secondload}]
\item[\var{secondassignment}]
\item[\var{secondfuncret}]
\end{description}
\item[\file{cg386set}] Set functions
\begin{description}
\item[\var{secondcase}]
\item[\var{secondin}]
\end{description}
\item[\file{cg386cal}] Call/inline functions
\begin{description}
\item[\var{secondparacall}]
\item[\var{secondcall}]
\item[\var{secondprocinline}]
\item[\var{secondinline}]
\end{description}
\item[\file{cgi386}] Main secondpass handling
\begin{description}
\item[\var{secondnothing}]
\item[\var{seconderror}]
\item[\var{secondasm}]
\item[\var{secondblockn}]
\item[\var{secondstatement}]
\end{description}
\end{description}
\end{document}