%
%   $Id$
%   This file is part of the FPC documentation.
%   Copyright (C) 1997, by Michael Van Canneyt
%
%   The FPC documentation is free text; you can redistribute it and/or
%   modify it under the terms of the GNU Library General Public License as
%   published by the Free Software Foundation; either version 2 of the
%   License, or (at your option) any later version.
%
%   The FPC Documentation is distributed in the hope that it will be useful,
%   but WITHOUT ANY WARRANTY; without even the implied warranty of
%   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
%   Library General Public License for more details.
%
%   You should have received a copy of the GNU Library General Public
%   License along with the FPC documentation; see the file COPYING.LIB.  If not,
%   write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330,
%   Boston, MA 02111-1307, USA.
%
\documentclass{report}
\usepackage{a4}
\usepackage{html}
\latex{\usepackage{multicol}}
\latex{\usepackage{fpcman}}
\html{\input{fpc-html.tex}}
% define the version number here, and not in the fpc.sty !!!
\newcommand{\remark}[1]{\par$\rightarrow$\textbf{#1}\par}
\newcommand{\olabel}[1]{\label{option:#1}}
% We should change this to something better. See \seef etc.
\begin{document}
\title{Free Pascal \\ Programmers' manual}
\docdescription{Programmers' manual for \fpc, version \fpcversion}
\docversion{1.4}
\date{July 1998}
\author{Micha\"el Van Canneyt}
\maketitle
\tableofcontents
\newpage
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Introduction
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section*{About this document}
This is the programmer's manual for \fpc.

It describes some of the peculiarities of the \fpc compiler, and provides a
glimpse of how the compiler generates its code, and how you can change the
generated code.
It will not, however, provide you with a detailed account of the inner
workings of the compiler, nor will it tell you how to use the compiler
(described in the \userref). It also will not describe the inner workings of
the Run-Time Library (RTL). The best way to learn about the way the RTL is
implemented is from the sources themselves.

The things described here are useful if you want to do things which need
greater flexibility than the standard Pascal language constructs
(described in the \refref) offer.

Since the compiler is continuously under development, this document may get
out of date. Wherever possible, the information in this manual will be
updated. If you find something which isn't correct, or you think something
is missing, feel free to contact me\footnote{at
\var{michael@tfdec1.fys.kuleuven.ac.be}}.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Compiler switches
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Compiler directives}
\label{ch:CompSwitch}
\fpc supports compiler directives in your source file. They are not the same
as Turbo Pascal directives, although some are supported for compatibility.
There is a distinction between local and global directives; local directives
take effect from the moment they are encountered, global directives have an
effect on all of the compiled code.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Local switches
\section{Local directives}
\label{se:LocalSwitch}
Local directives have no command-line counterpart. They influence the
compiler's behaviour from the moment they're encountered until the moment
another switch overrides them, or the end of the unit or program is reached.

\subsection{\var{\$F} : Far or near functions}
This directive is recognized for compatibility with Turbo Pascal. Under the
32-bit programming model, the concept of near and far calls has no meaning,
hence the directive is ignored.
A warning is printed to the screen, telling you so.

As an example, the following piece of code:
\begin{verbatim}
{$F+}

Procedure TestProc;

begin
 Writeln ('Hello From TestProc');
end;

begin
 testProc
end.
\end{verbatim}
generates the following compiler output:
\begin{verbatim}
malpertuus: >pp -vw testf
Compiler: ppc386
Units are searched in: /home/michael;/usr/bin/;/usr/lib/ppc/0.9.1/linuxunits
Target OS: Linux
Compiling testf.pp
testf.pp(1) Warning: illegal compiler switch
7739 kB free
Calling assembler...
Assembled...
Calling linker...
12 lines compiled, 1.00000000000000E+0000
\end{verbatim}
You can see that the verbosity level was set to display warnings.

If you declare a function as \var{Far} (this has the same effect as setting it
between \var{\{\$F+\}...\{\$F-\}} directives), the compiler also generates a
warning:
\begin{verbatim}
testf.pp(3) Warning: FAR ignored
\end{verbatim}
The same is true for procedures declared as \var{Near}. The warning displayed
in that case is:
\begin{verbatim}
testf.pp(3) Warning: NEAR ignored
\end{verbatim}

\subsection{\var{\$I} : Input/Output checking}
The \var{\{\$I-\}} directive tells the compiler not to generate input/output
checking code in your program. If you compile using the \var{-Ci} compiler
switch, the \fpc compiler inserts input/output checking code after every
input/output call in your program. If an error occurred during input or
output, then a run-time error will be generated. Use this switch if you wish
to avoid this behavior. If you still want to check if something went wrong,
you can use the \var{IOResult} function to see if everything went without
problems.

Conversely, \var{\{\$I+\}} will turn error-checking back on, until another
directive is encountered which turns it off again.

The most common use for this switch is to check if the opening of a file went
without problems, as in the following piece of code:
\begin{verbatim}
...
assign (f,'file.txt');
{$I-}
rewrite (f);
{$I+}
if IOResult<>0 then
  begin
  Writeln ('Error opening file : "file.txt"');
  exit
  end;
...
\end{verbatim}

\subsection{\var{\$I} : Include file }
The \var{\{\$I filename\}} directive tells the compiler to read further
statements from the file \var{filename}. The statements read there will be
inserted as if they occurred in the current file.

The compiler will append the \file{.pp} extension to the file if you don't
specify an extension yourself. Do not put the filename between quotes, as they
will be regarded as part of the file's name.

You can nest included files, but not infinitely deep. The number of files is
restricted to the number of file descriptors available to the \fpc compiler.

Contrary to Turbo Pascal, include files can cross blocks. I.e. you can start
a block in one file (with a \var{Begin} keyword) and end it in another (with
an \var{End} keyword). The smallest entity in an include file must be a token,
i.e. an identifier, keyword or operator.

\subsection{\var{\$L} : Link object file}
The \var{\{\$L filename\}} directive tells the compiler that the file
\file{filename} should be linked to your program. You can only use this
directive in a program. If you do use it in a unit, the compiler will not
complain, but simply ignores the directive.

The compiler will {\em not} look for the file in the unit path.
The name will be passed to the linker {\em exactly} as you've typed it.

Since the file's name is passed directly to the linker, this means that on
\linux systems, the name is case sensitive, and must be typed exactly as it
appears on your system.

{\em Remark :} Take care that the object file you're linking is in a format
the linker understands. Which format this is, depends on the platform you're
on. Typing \var{ld} on the command line gives a list of formats \var{ld} knows
about.

You can pass other files and options to the linker using the \var{-k}
command-line option.
You can specify more than one of these options, and they will be passed to
the linker, in the order that you specified them on the command line, just
before the names of the object files that must be linked.

% Assembler type
\subsection{\var{\$I386\_XXX} : Specify assembler format (Intel x86 only)}
This switch informs the compiler what kind of assembler it can expect in an
\var{asm} block. The \var{XXX} should be replaced by one of the following:
\begin{description}
\item [att\ ] Indicates that \var{asm} blocks contain AT\&T syntax assembler.
\item [intel\ ] Indicates that \var{asm} blocks contain Intel syntax
assembler.
\item [direct\ ] Tells the compiler that \var{asm} blocks should be copied
directly to the assembler file.
\end{description}
These switches are local, and retain their value to the end of the unit that
is compiled, unless they are replaced by another directive of the same type.
The command-line switch that corresponds to this switch is \var{-R}.

\subsection{\var{\$MMX} : MMX support (Intel x86 only)}
As of version 0.9.8, \fpc supports optimization for the \textbf{MMX} Intel
processor (see also \ref{ch:MMXSupport}). This optimizes certain code parts
for the \textbf{MMX} Intel processor, thus greatly improving speed. The speed
gain is noticed mostly when moving large amounts of data. Things that change
are
\begin{itemize}
\item Data with a size that is a multiple of 8 bytes is moved using the
\var{movq} assembler instruction, which moves 8 bytes at a time.
\end{itemize}

When \textbf{MMX} support is on, you aren't allowed to do floating point
arithmetic. You are allowed to move floating point data, but no arithmetic
can be done. If you wish to do floating point math anyway, you must first
switch off \textbf{MMX} support and clear the FPU using the \var{emms}
function of the \file{cpu} unit.
The following example will make this clearer:
\begin{verbatim}
Program MMXDemo;

uses cpu;

var
   d1 : double;
   a : array[0..10000] of double;
   i : longint;

begin
   d1:=1.0;
{$mmx+}
   { floating point data is used, but we do _no_ arithmetic }
   for i:=0 to 10000 do
     a[i]:=d1;  { this is done with 64 bit moves }
{$mmx-}
   emms;   { clear fpu }
   { now we can do floating point arithmetic }
   ....
end.
\end{verbatim}
See, however, the chapter on MMX (\ref{ch:MMXSupport}) for more information
on this topic.

\subsection{\var{\$OUTPUT\_FORMAT} : Specify the output format}
\var{\{\$OUTPUT\_FORMAT format\}} has the same functionality as the \var{-A}
command-line option: it tells the compiler what kind of object file must be
generated. You can specify this switch \textbf{only} before the \var{Program}
or \var{Unit} clause in your source file. The different kinds of formats are
shown in \seet{Formats}.

\begin{FPCltable}{ll}{Formats generated by the x86 compiler}{Formats} \hline
Switch value & Generated format \\ \hline
att & AT\&T assembler file. \\
o & Unix object file.\\
obj & OMF file.\\
wasm & assembler for the Watcom assembler. \\ \hline
\end{FPCltable}

\subsection{\var{\$V} : Var-string checking}
When in the \var{+} state, the compiler checks that strings passed as
parameters are of the same, identical, string type as the declared parameters
of the procedure.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Global switches
\section{Global directives}
\label{se:GlobalSwitch}
Global directives affect the whole of the compilation process. That is why
they also have a command-line counterpart. The command-line counterpart is
given for each of the directives.

\subsection{\var{\$A} : Align Data}
This switch is recognized for Turbo Pascal compatibility, but is not yet
implemented. The alignment of data will be different in any case, since \fpc
is a 32-bit compiler.
\subsection{\var{\$B} : Complete boolean evaluation}
This switch is understood by the \fpc compiler, but is ignored. The compiler
always uses shortcut evaluation, i.e. the evaluation of a boolean expression
is stopped once the result of the total expression is known with certainty.

So, in the following example, the function \var{Bofu}, which has a boolean
result, will never get called.
\begin{verbatim}
If False and Bofu then
  ...
\end{verbatim}

\subsection{\var{\$D} : Debugging symbols}
When this switch is on, the compiler inserts GNU debugging information in the
executable. The effect of this switch is the same as the command-line switch
\var{-g}. By default, insertion of debugging information is off.

\subsection{\var{\$E} : Emulation of coprocessor}
This directive controls the emulation of the coprocessor. There is no
command-line counterpart for this directive.

\subsubsection{ Intel x86 version }
When this switch is enabled, all floating point instructions which are not
supported by standard coprocessor emulators will cause a warning to be issued.

The compiler itself doesn't do the emulation of the coprocessor.

To use coprocessor emulation under \dos go32v1, nothing special is required,
as it is handled automatically.

To use coprocessor emulation under \dos go32v2, you must use the emu387 unit,
which contains correct initialization code for the emulator.

Under \linux, the kernel takes care of the coprocessor support.

\subsubsection{ Motorola 680x0 version }
When the switch is on, no floating point opcodes are emitted by the code
generator. Instead, internal run-time library routines are called to do the
necessary calculations. In this case all real types are mapped to the single
IEEE floating point type.

\emph{ Remark : } By default, emulation is on. It is possible to intermix
emulation code with real floating point opcodes, as long as the only type
used is single or real.
\subsection{\var{\$G} : Generate 80286 code}
This option is recognised for Turbo Pascal compatibility, but is ignored.

\subsection{\var{\$L} : Local symbol information}
This switch (not to be confused with the \var{\{\$L file\}} file linking
directive) is recognised for Turbo Pascal compatibility, but is ignored.
Generation of symbol information is controlled by the \var{\$D} switch.

\subsection{\var{\$N} : Numeric processing }
This switch is recognised for Turbo Pascal compatibility, but is otherwise
ignored, since the compiler always uses the coprocessor for floating point
mathematics.

\subsection{\var{\$O} : Overlay code generation }
This switch is recognised for Turbo Pascal compatibility, but is otherwise
ignored.

\subsection{\var{\$Q} : Overflow checking}
The \var{\{\$Q+\}} directive turns on integer overflow checking.
This means that the compiler inserts code to check for overflow when doing
computations with an integer. When an overflow occurs, the run-time library
will print a message \var{Overflow at xxx}, and exit the program with exit
code 215.

\emph{ Remark: } Overflow checking behaviour is not the same as in Turbo
Pascal since all arithmetic operations are done via 32-bit values.
Furthermore, the Inc() and Dec() standard system procedures \emph{ are }
checked for overflow in \fpc, while in Turbo Pascal they are not.

Using the \var{\{\$Q-\}} switch switches off the overflow checking code
generation.

The generation of overflow checking code can also be controlled using the
\var{-Co} command line compiler option (see \userref).

\subsection{\var{\$R} : Range checking}
By default, the compiler doesn't generate code to check the ranges of array
indices, enumeration types, subrange types, etc. Specifying the
\var{\{\$R+\}} switch tells the compiler to generate code to check these
indices.
If, at run-time, an index or enumeration value is specified that is outside
its declared range, then a run-time error is generated, and the program
exits with exit code 201.

The \var{\{\$R-\}} switch tells the compiler not to generate range checking
code. This may result in faulty program behaviour, but no run-time errors
will be generated.

{\em Remark: } Range checking for sets and enumerations is not yet fully
implemented.

\subsection{\var{\$S} : Stack checking}
The \var{\{\$S+\}} directive tells the compiler to generate stack checking
code. This generates code to check if a stack overflow occurred, i.e. to see
whether the stack has grown beyond its maximally allowed size. If the stack
grows beyond the maximum size, then a run-time error is generated, and the
program will exit with exit code 202.

Specifying \var{\{\$S-\}} will turn generation of stack-checking code off.

The command-line compiler switch \var{-Ct} has the same effect as the
\var{\{\$S+\}} directive.

\subsection{\var{\$X} : Extended syntax}
Extended syntax allows you to drop the result of a function. This means that
you can use a function call as if it were a procedure. By default, this
feature is on. You can switch it off using the \var{\{\$X-\}} directive.

The following, for instance, will not compile:
\begin{verbatim}
function Func (var Arg : sometype) : longint;
begin
...          { declaration of Func }
end;

...

{$X-}
Func (A);
\end{verbatim}
The reason this construct is supported is that you may wish to call a
function for certain side-effects it has, but you don't need the function
result. In this case you don't need to assign the function result, saving you
an extra variable.

The command-line compiler switch \var{-Sa1} has the same effect as the
\var{\{\$X+\}} directive.
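
As a complementary sketch, the small program below shows the case that does
compile, with extended syntax switched on. The names \var{xdemo},
\var{Counter} and \var{Bump} are made up for this illustration; they are not
part of the run-time library.
\begin{verbatim}
program xdemo;

{$X+}  { extended syntax on: a function result may be dropped }

var
  Counter : longint;

{ Hypothetical helper: increments the global counter and
  returns its new value. }
function Bump : longint;
begin
  Counter := Counter + 1;
  Bump := Counter;
end;

begin
  Counter := 0;
  Bump;             { called as a procedure; the result is dropped }
  Writeln (Bump);   { called as a function; the result is used }
end.
\end{verbatim}
Here the first call to \var{Bump} is made only for its side effect on
\var{Counter}, which is exactly the situation extended syntax is meant for.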
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Using conditionals and macros
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Using conditionals, Messages and macros}
\label{ch:CondMessageMacro}
The \fpc compiler supports conditionals as in normal Turbo Pascal. It does,
however, more than that. It allows you to make macros which can be used in
your code, and it allows you to define messages or errors which will be
displayed when compiling.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Conditionals
\section{Conditionals}
\label{se:Conditionals}
The rules for using conditional symbols are the same as under Turbo Pascal.
Defining a symbol goes as follows:
\begin{verbatim}
{$Define Symbol }
\end{verbatim}
From this point on in your code, the compiler knows the symbol \var{Symbol}.
Symbols are, like the Pascal language, case insensitive.

You can also define a symbol on the command line. The \var{-dSymbol} option
defines the symbol \var{Symbol}. You can specify as many symbols on the
command line as you want.

Undefining an existing symbol is done in a similar way:
\begin{verbatim}
{$Undef Symbol }
\end{verbatim}
If the symbol didn't exist yet, this doesn't do anything. If the symbol
existed previously, the symbol will be erased, and will not be recognized any
more in the code following the \verb|{$Undef ...}| statement.

You can also undefine symbols from the command line with the \var{-u}
command-line switch.

To compile code conditionally, depending on whether a symbol is defined or
not, you can enclose the code in a \verb|{$ifdef Symbol}| .. \verb|{$endif}|
pair. For instance the following code will never be compiled:
\begin{verbatim}
{$Undef MySymbol}
{$ifdef Mysymbol}
  DoSomething;
  ...
{$endif}
\end{verbatim}

Similarly, you can enclose your code in a \verb|{$Ifndef Symbol}| ..
\verb|{$endif}| pair. Then the code between the pair will only be compiled
when the symbol used doesn't exist.
For example, in the following code, the call to \var{DoSomething} will always
be compiled:
\begin{verbatim}
{$Undef MySymbol}
{$ifndef Mysymbol}
  DoSomething;
  ...
{$endif}
\end{verbatim}

You can combine the two alternatives in one structure, as follows:
\begin{verbatim}
{$ifdef Mysymbol}
  DoSomething;
{$else}
  DoSomethingElse
{$endif}
\end{verbatim}
In this example, if \var{MySymbol} exists, then the call to \var{DoSomething}
will be compiled. If it doesn't exist, the call to \var{DoSomethingElse} is
compiled.

The \fpc compiler defines some symbols before starting to compile your
program or unit. You can use these symbols to differentiate between different
versions of the compiler, and between different compilers.
In \seet{Symbols}, a list of pre-defined symbols is given\footnote{Remark:
The \var{FPK} symbol is still defined for compatibility with older versions.}.
In that table, you should replace \var{v} with the version number of the
compiler you're using, \var{r} with the release number and \var{p} with the
patch-number of the compiler. 'OS' needs to be replaced by the type of
operating system. Currently this can be one of \var{DOS}, \var{GO32V2},
\var{LINUX}, \var{OS2}, \var{WIN32}, \var{MACOS}, \var{AMIGA} or \var{ATARI}.
This symbol is undefined if you specify a target that is different from the
platform you're compiling on. The \var{-TSomeOS} option on the command line
will define the \var{SomeOS} symbol, and will undefine the existing platform
symbol\footnote{In versions prior to 0.9.4, this didn't happen, thus making
cross-compiling impossible.}.

\begin{FPCltable}{c}{Symbols defined by the compiler.}{Symbols} \hline
FPC \\
VER\var{v} \\
VER\var{v}\_\var{r} \\
VER\var{v}\_\var{r}\_\var{p} \\
OS \\ \hline
\end{FPCltable}

As an example: version 0.9.1 of the compiler, running on a Linux system,
defines the following symbols before reading the command line arguments:
\var{FPC}, \var{VER0}, \var{VER0\_9}, \var{VER0\_9\_1} and \var{LINUX}.
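
A minimal sketch of how these pre-defined symbols might be tested in a
program is given below. The program name \var{whichos} and the messages are
made up for this illustration; only the symbols \var{FPC}, \var{LINUX} and
\var{OS2} named above are assumed.
\begin{verbatim}
program whichos;

begin
{$ifdef FPC}
  Writeln ('Compiled with Free Pascal');
{$endif}
{$ifdef LINUX}
  Writeln ('Target is Linux');
{$else}
  {$ifdef OS2}
  Writeln ('Target is OS/2');
  {$else}
  Writeln ('Target is some other system');
  {$endif}
{$endif}
end.
\end{verbatim}
Only one of the three target messages ends up in the compiled program,
depending on which platform symbol is defined at compile time.
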
Specifying \var{-TOS2} on the command-line will undefine the \var{LINUX}
symbol, and will define the \var{OS2} symbol.

{\em Remark: } Symbols, even when they're defined in the interface part of
a unit, are not available outside that unit.

\fpc supports the \var{\{\$IFOPT \}} directive for Turbo Pascal
compatibility, but doesn't act on it. It always rejects the condition, so
code between \var{\{\$IFOPT \}} and \var{\{\$Endif\}} is never compiled.

Besides the Turbo Pascal constructs, from version 0.9.8 and higher, the \fpc
compiler also supports a stronger conditional compile mechanism: the
\var{\{\$If \}} construct.

The prototype of this construct is as follows:
\begin{verbatim}
{$If expr}
  CompileTheseLines;
{$else}
  BetterCompileTheseLines;
{$endif}
\end{verbatim}
In this directive \var{expr} is a Pascal expression which is evaluated using
strings, unless both parts of a comparison can be evaluated as numbers, in
which case they are evaluated using numbers\footnote{Otherwise
\var{\{\$If 8>54\}} would evaluate to \var{True}}.
If the complete expression evaluates to \var{'0'}, then it is considered
false and rejected. Otherwise it is considered true and accepted. This may
have unexpected consequences:
\begin{verbatim}
{$If 0}
\end{verbatim}
will evaluate to \var{False} and be rejected, while
\begin{verbatim}
{$If 00}
\end{verbatim}
will evaluate to \var{True}.

You can use any Pascal operator to construct your expression: \var{=, <>,
>, <, >=, <=, AND, NOT, OR} and you can use round brackets to change the
precedence of the operators.
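
As a small sketch of the comparison rule mentioned in the footnote, the
program below lets the compiler pick one of two \var{Writeln} statements.
The program name \var{ifdemo} and the messages are made up for this
illustration. Under the rule described above, both operands are numbers, so
the first branch would be expected to be compiled; a purely string-based
comparison would select the second one.
\begin{verbatim}
program ifdemo;

begin
{$if 8<54}
  Writeln ('8<54 was accepted: compared as numbers');
{$else}
  Writeln ('8<54 was rejected: compared as strings');
{$endif}
end.
\end{verbatim}
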
The following example shows you many of the possibilities:
\begin{verbatim}
{$ifdef fpc}

var
   y : longint;
{$else fpc}

var
   z : longint;
{$endif fpc}

var
   x : longint;

begin

{$if (fpc_version=0) and (fpc_release>6) and (fpc_patch>4)}
{$info At least this is version 0.9.5}
{$else}
{$fatalerror Problem with version check}
{$endif}

{$define x:=1234}
{$if x=1234}
{$info x=1234}
{$else}
{$fatalerror x should be 1234}
{$endif}

{$if 12asdf and 12asdf}
{$info $if 12asdf and 12asdf is ok}
{$else}
{$fatalerror $if 12asdf and 12asdf rejected}
{$endif}

{$if 0 or 1}
{$info $if 0 or 1 is ok}
{$else}
{$fatalerror $if 0 or 1 rejected}
{$endif}

{$if 0}
{$fatalerror $if 0 accepted}
{$else}
{$info $if 0 is ok}
{$endif}

{$if 12=12}
{$info $if 12=12 is ok}
{$else}
{$fatalerror $if 12=12 rejected}
{$endif}

{$if 12<>312}
{$info $if 12<>312 is ok}
{$else}
{$fatalerror $if 12<>312 rejected}
{$endif}

{$if 12<=312}
{$info $if 12<=312 is ok}
{$else}
{$fatalerror $if 12<=312 rejected}
{$endif}

{$if 12<312}
{$info $if 12<312 is ok}
{$else}
{$fatalerror $if 12<312 rejected}
{$endif}

{$if a12=a12}
{$info $if a12=a12 is ok}
{$else}
{$fatalerror $if a12=a12 rejected}
{$endif}

{$if a12<=z312}
{$info $if a12<=z312 is ok}
{$else}
{$fatalerror $if a12<=z312 rejected}
{$endif}

{$if a12$7fff becomes $ffff) }
   audio1:=(audio1+helpdata2)-helpdata2;
{$saturation-}
   { now multiply with 2 and change to integer }
   audio1:=(audio1 shl 1)-helpdata2;
{$mmx-}
end.
\end{verbatim}

\section{Restrictions of MMX support}
\label{se:MMXrestrictions}
The MMX instructions were introduced in the Pentium processors in the
beginning of 1997, so multitasking systems wouldn't save the newly
introduced MMX registers. To work around that problem, Intel mapped the MMX
registers to the FPU registers.

The consequence is that you can't mix MMX and floating point operations.
After using MMX operations and before using floating point operations, you
have to call the routine \var{EMMS} of the \var{MMX} unit.
This routine restores the FPU registers.

{\em Careful:} The compiler doesn't warn you if you mix floating point and
MMX operations.

The MMX instructions are optimized for multimedia (what else?), so not every
operation is possible: some operations give a type mismatch. See section
\ref{se:SupportedMMX} for the supported MMX operations.

An important restriction is that MMX operations aren't range or overflow
checked, even when you turn range and overflow checking on. This is due to
the nature of MMX operations.

The \var{MMX} unit must always be used when doing MMX operations, because
the exit code of this unit clears the MMX unit. If it didn't do that, other
programs would crash. A consequence of this is that you can't use MMX
operations in the exit code of your units or programs, since they would
interfere with the exit code of the \var{MMX} unit. The compiler can't check
this, so you are responsible for this!

\section{Supported MMX operations}
\label{se:SupportedMMX}

\section{Optimizing MMX support}
\label{se:OptimizingMMX}
Here are some helpful hints to get optimal performance:
\begin{itemize}
\item The \var{EMMS} call takes a lot of time, so try to separate floating
point and MMX operations.
\item Use MMX only in low level routines because the compiler saves all used
MMX registers when calling a subroutine.
\item The NOT-operator isn't supported natively by MMX, so the compiler has
to generate a workaround and this operation is inefficient.
\item Simple assignments of floating point numbers don't access floating
point registers, so you need no call to the \var{EMMS} procedure. Only when
doing arithmetic do you need to call the \var{EMMS} procedure.
\end{itemize}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Memory issues
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Memory issues}
\label{ch:Memory}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The 32-bit model
\section{The 32-bit model}
\label{se:ThirtytwoBit}
The \fpc compiler generates 32-bit code. This has several consequences:
\begin{itemize}
\item You need a 386 processor to run the generated code. The compiler
functions on a 286 when you compile it using Turbo Pascal, but the generated
programs cannot be assembled or executed.
\item You don't need to bother with segment selectors. Memory can be
addressed using a single 32-bit pointer. The amount of memory is limited
only by the available amount of (virtual) memory on your machine.
\item The structures you define are unlimited in size. Arrays can be as long
as you want. You can request memory blocks of any size.
\end{itemize}

The fact that 32-bit code is used means that some of the older Turbo Pascal
constructs and functions are obsolete. The following is a list of functions
which shouldn't be used anymore:
\begin{description}
\item [Seg()] : Returned the segment of a memory address. Since segments no
longer have any meaning, zero is returned in the \fpc run-time library
implementation of \var{Seg}.
\item [Ofs()] : Returned the offset of a memory address. Since segments no
longer have any meaning, the complete address is returned in the \fpc
implementation of this function. As a consequence, the return type is
\var{Longint} instead of \var{Word}.
\item [Cseg(), Dseg()] : Returned, respectively, the code and data segments
of your program. These return zero in the \fpc implementation of the system
unit, since both code and data are in the same memory space.
\item [Ptr] accepted a segment and offset from an address, and would return
a pointer to this address. This has been changed in the run-time library.
By default, it now simply returns the offset. If you want to retain the old
functionality, you can recompile the run-time library with the
\var{DoMapping} symbol defined. This will restore the Turbo Pascal behaviour.
\item [memw and mem] these arrays gave access to the \dos memory. \fpc
supports them; they are mapped into \dos memory space. You need the
\var{GO32} unit for this.
\end{description}

You shouldn't use these functions, since they are very non-portable: they're
specific to \dos and the ix86 processor. The \fpc compiler is designed to be
portable to other platforms, so you should keep your code as portable as
possible, and not system specific. That is, unless you're writing some driver
units, of course.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The stack
\section{The stack}
\label{se:Stack}
The stack is used to pass parameters to procedures or functions, to store
local variables, and, in some cases, to return function results.

When a function or procedure is called, the following is done by the
compiler:
\begin{enumerate}
\item If there are any parameters to be passed to the procedure, they are
pushed from right to left on the stack.
\item If a function is called that returns a variable of type \var{String},
\var{Set}, \var{Record}, \var{Object} or \var{Array}, then an address to
store the function result in is pushed on the stack.
\item If the called procedure or function is an object method, then the
pointer to \var{self} is pushed on the stack.
\item If the procedure or function is nested in another function or
procedure, then the frame pointer of the parent procedure is pushed on the
stack.
\item The return address is pushed on the stack (this is done automatically
by the instruction which calls the subroutine).
\end{enumerate}

The resulting stack frame upon entering looks as in \seet{StackFrame}.

\begin{FPCltable}{llc}{Stack frame when calling a procedure}{StackFrame}
\hline
Offset & What is stored & Optional ?
\\ \hline
+x & parameters & Yes \\
+12 & function result & Yes \\
+8 & self & Yes \\
+4 & Frame pointer of parent procedure & Yes \\
+0 & Return address & No\\ \hline
\end{FPCltable}

\subsection{ Intel x86 version }
The stack is cleared with the \var{ret} I386 instruction, meaning that the
size of all pushed parameters is limited to 64K.

\subsubsection{ DOS }
Under the DOS targets, the default stack is set to 256Kb. This value cannot
be modified for the GO32V1 target. It can, however, be modified for the
GO32V2 target using a special DJGPP utility, \var{stubedit}.

Note that the stack size may be changed with some compiler switches. If the
stack size set this way is \emph{greater} than the default stack size, it
will be used instead; otherwise the default stack size is used.

\subsubsection{ Linux }
Under Linux, stack size is only limited by the available memory of the
system.

\subsubsection{ OS/2 }
Under OS/2, stack size is determined by one of the runtime environment
variables set for EMX. Therefore, the stack size is user defined.

\subsection{ Motorola 680x0 version }
Depending on the processor target, the stack can be cleared in two ways.
If the target processor is a MC68020 or higher, the stack will be cleared
with a simple \var{rtd} instruction, meaning that the size of all pushed
parameters is limited to 32K.

Otherwise, on MC68000/68010 processors, the stack clearing mechanism is
slightly more complicated. The exit code will look like this:
\begin{verbatim}
 {
  move.l  (sp)+,a0
  add.l   paramsize,a0
  move.l  a0,-(sp)
  rts
 }
\end{verbatim}

\subsubsection{ Amiga }
Under AmigaOS, stack size is determined by the user, who sets this value
using the stack program. Typical sizes range from 4K to 40K.

\subsubsection{ Atari }
Under Atari TOS, stack size is currently limited to 8K, and it cannot be
modified. This may change in a future release of the compiler.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The heap
\section{The heap}
\label{se:Heap}
The heap is used to store all dynamic variables, and to store class
instances. The interface to the heap is the same as in Turbo Pascal, although
the effects may not be the same. On top of that, the \fpc run-time library
has some extra possibilities, not available in Turbo Pascal. These extra
possibilities are explained in the next subsections.

% The heap grows
\subsection{The heap grows}
\fpc supports the \var{HeapError} procedural variable. If this variable is
non-nil, then it is called when you try to allocate memory and the heap is
full. By default, \var{HeapError} points to the \var{GrowHeap} function,
which tries to increase the heap.

The \var{GrowHeap} function issues a system call to try to increase the size
of the memory available to your program. It first tries to increase memory in
a 1 Mb chunk. If this fails, it tries to increase the heap by the amount you
requested from the heap.

If the call to \var{GrowHeap} has failed, then a run-time error is generated,
or nil is returned, depending on the \var{GrowHeap} result.

If the call to \var{GrowHeap} was successful, then the needed memory will be
allocated.

% Using Blocks
\subsection{Using Blocks}
If you need to allocate a lot of small blocks for a short period, then you
may want to recompile the run-time library with the \var{USEBLOCKS} symbol
defined. If it is recompiled, then the heap management is done in a different
way.

The run-time library keeps a linked list of allocated blocks with size up to
256 bytes\footnote{The size can be set using the \var{max\_size} constant in
the \file{heap.inc} source file.}. By default, it keeps 32 of these
lists\footnote{The actual size is \var{max\_size div 8}.}.

When a piece of memory in a block is deallocated, the heap manager doesn't
really deallocate the occupied memory. The block is simply put in the linked
list corresponding to its size.
When you then request a block of memory again, the manager checks in the list
whether there is a non-allocated block which fits the size you need (rounded
to 8 bytes). If so, the block is used to allocate the memory you requested.

This method of allocating works faster if the heap is very fragmented, and
you allocate a lot of small memory chunks.

Since it is invisible to the program, this provides an easy way of improving
the performance of the heap manager.

% The splitheap
\subsection{Using the split heap}
{\em Remark : The split heap is still somewhat buggy. Use at your own risk
for the moment.}

The split heap can be used to quickly release a lot of blocks you allocated
previously.

Suppose that in a part of your program, you allocate a lot of memory chunks
on the heap. Suppose that you know that you'll release all this memory when
this particular part of your program is finished.

In Turbo Pascal, you could foresee this, and mark the position of the heap
(using the \var{Mark} function) when entering this particular part of your
program, and release the occupied memory in one call with the \var{Release}
call.

For most purposes, this works very well. But sometimes, you may need to
allocate something on the heap that you {\em don't} want deallocated when you
release the allocated memory. That is where the split heap comes in.

When you split the heap, the heap manager keeps 2 heaps: the base heap (the
normal heap), and the temporary heap. After the call to split the heap,
memory is allocated from the temporary heap. When you're finished using all
this memory, you unsplit the heap. This clears all the memory on the split
heap with one call. After that, memory will be allocated from the base heap
again.

So far, nothing special, nothing that can't be done with calls to \var{mark}
and \var{release}. Suppose now that you have split the heap, and that you've
come to a point where you need to allocate memory that is to stay allocated
after you unsplit the heap again.
At this point, mark and release are of no use. But when using the split heap,
you can tell the heap manager to --temporarily-- use the base heap again to
allocate memory. When you've allocated the needed memory, you can tell the
heap manager that it should start using the temporary heap again. When you're
finished using the temporary heap, you release it, and the memory you
allocated on the base heap will still be allocated.

To use the split heap, you must recompile the run-time library with the
\var{TempHeap} symbol defined. The following procedures are then available:
\begin{verbatim}
procedure Split_Heap;
procedure Switch_To_Base_Heap;
procedure Switch_To_Temp_Heap;
procedure Switch_Heap;
procedure ReleaseTempHeap;
procedure GetTempMem(var p : pointer;size : longint);
\end{verbatim}
\var{split\_heap} is used to split the heap. It cannot be called twice in a
row without an intervening call to \var{releasetempheap}.
\var{Releasetempheap} completely releases the memory used by the temporary
heap. Switching temporarily back to the base heap can be done using the
\var{switch\_to\_base\_heap} call, and returning to the temporary heap is
done using the \var{switch\_to\_temp\_heap} call. Switching from one to the
other without knowing which one you are on right now can be done using the
\var{switch\_heap} call, which will split the heap first if needed.

A call to \var{GetTempMem} will allocate a memory block on the temporary
heap, whatever the current heap is. The current heap after this call will be
the temporary heap.

Typically, what will appear in your code is the following sequence:
\begin{verbatim}
Split_Heap;
...
{ Memory allocation }
...
{ !! non-volatile memory needed !!}
Switch_To_Base_Heap;
getmem (P,size);
Switch_To_Temp_Heap;
...
{Memory allocation}
...
ReleaseTempHeap;
{All allocated memory is now freed, except for the memory pointed to by 'P' }
...
\end{verbatim}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Accessing DOS memory under the GO32 extender
\section{Accessing \dos memory under the Go32 extender (Intel x86 only) }
\label{se:AccessingDosMemory}
Because \fpc is a 32 bit compiler, and uses a \dos extender, accessing DOS
memory isn't trivial. What follows is an attempt at an explanation of how to
access and use \dos or real mode memory\footnote{Thanks to an explanation by
Thomas Schatzl (E-mail:\var{tom\_at\_work@geocities.com}).}.

In {\em Protected Mode}, memory is accessed through {\em Selectors} and
{\em Offsets}. You can think of Selectors as the protected mode equivalents
of segments.

In \fpc, a pointer is an offset into the \var{DS} selector, which points to
the Data of your program.

To access the (real mode) \dos memory, you need a selector that points to the
\dos memory. The \file{GO32} unit provides you with such a selector: the
\var{DosMemSelector} variable, as it is conveniently called.

You can also allocate memory in \dos's memory space, using the
\var{global\_dos\_alloc} function of the \file{GO32} unit. This function will
allocate memory in a place where \dos sees it.

As an example, here is a procedure that allocates memory in real mode \dos
and returns a selector and a real mode segment for it.
\begin{verbatim}
procedure dosalloc(var selector : word;
                   var segment : word;
                   size : longint);

var result : longint;

begin
     result := global_dos_alloc(size);
     { low word: selector, high word: real mode segment }
     selector := word(result);
     segment := word(result shr 16);
end;
\end{verbatim}
(You need to free this memory using the \var{global\_dos\_free} function.)

You can access any place in memory using a selector. You can get a selector
using the \var{allocate\_ldt\_descriptors} function, and then let this
selector point to the physical memory you want using the
\var{set\_segment\_base\_address} function, and set its length using the
\var{set\_segment\_limit} function.
You can manipulate the memory pointed to by the selector using the functions
of the GO32 unit, for instance with the \var{seg\_fillchar} function. After
using the selector, you must free it again using the
\var{free\_ldt\_descriptor} function.

More information on all this can be found in the \unitsref, the chapter on
the \file{GO32} unit.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Optimizations done in the compiler
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Optimizations}
\section{ Non processor specific }
The following sections describe the general optimizations done by the
compiler; they are not processor specific. Some of these require a compiler
switch, while others are done automatically (those which require a switch
will be noted as such).

\subsection{ Constant folding }
In \fpc, if the operand(s) of an operator are constants, they will be
evaluated at compile time.

Example
\begin{verbatim}
  x:=1+2+3+6+5;
will generate the same code as
  x:=17;
\end{verbatim}
Furthermore, if an array index is a constant, the offset will be evaluated at
compile time. This means that accessing MyData[5] is as efficient as
accessing a normal variable.

Finally, calling the \var{Chr}, \var{Hi}, \var{Lo}, \var{Ord}, \var{Pred}, or
\var{Succ} functions with constant parameters generates no run-time library
calls; instead, the values are evaluated at compile time.

\subsection{ Constant merging }
Using the same constant string two or more times generates only one copy of
the string constant.

\subsection{ Short cut evaluation }
Evaluation of a boolean expression stops as soon as the result is known,
which makes code execute faster than if all boolean operands were evaluated.

\subsection{ Constant set inlining }
Using the \var{in} operator is always more efficient than using the
equivalent \var{<>}, \var{=}, \var{<=}, \var{>=}, \var{<} and \var{>}
operators.
This is because range comparisons can be done more easily with \var{in} than
with normal comparison operators.

\subsection{ Small sets }
Sets which contain fewer than 33 elements can be directly encoded using a
32-bit value; therefore no run-time library calls to evaluate operands on
these sets are required: they are directly encoded by the code generator.

\subsection{ Range checking }
Assignments of constants to variables are range checked at compile time,
which removes the need for the generation of runtime range checking code.

\emph{Remark:} This feature was not implemented before version 0.99.5 of
\fpc.

\subsection{ Shifts instead of multiply or divide }
When one of the operands in a multiplication is a power of two, the
multiplication is encoded using arithmetic shift instructions, which
generates more efficient code.

Similarly, if the divisor in a \var{div} operation is a power of two, it is
encoded using arithmetic shift instructions.

The same is true when accessing array indexes which are powers of two: the
address is calculated using arithmetic shifts instead of the multiply
instruction.

\subsection{ Automatic alignment }
By default all variables larger than a byte are guaranteed to be aligned at
least on a word boundary.

Furthermore all pointers allocated using the standard runtime library
(\var{New} and \var{GetMem} among others) are guaranteed to be aligned on a
quadword boundary (64-bit alignment).

Alignment of variables on the stack depends on the target processor.

\emph{ Remark: } Quadword alignment of pointers is not guaranteed on systems
which don't use an internal heap, such as for the Win32 target.

\emph{ Remark: } Alignment is also done \emph{between} fields in records,
objects and classes. This is \emph{not} the same as in Turbo Pascal and may
cause problems when using disk I/O with these types. To get no alignment
between fields use the \var{packed} directive or the
\var{\{\$PackRecords n\}} switch.
For further information, take a look at the reference manual under the
\var{record} heading.

\subsection{ Smart linking }
This feature removes all unreferenced code in the final executable file,
making the executable file much smaller.

\emph{ Remark: } Smart linking was implemented starting with version 0.99.6
of \fpc.

\subsection{ Inline routines }
The following runtime library routines are coded directly into the final
executable: \var{Lo}, \var{Hi}, \var{High}, \var{Sizeof}, \var{TypeOf},
\var{Length}, \var{Pred}, \var{Succ}, \var{Inc}, \var{Dec} and
\var{Assigned}.

\emph{ Remark: } Inline \var{Inc} and \var{Dec} were not completely
implemented until version 0.99.6 of \fpc.

\subsection{ Case optimization }
When using the \var{-Oa} switch, some case statements will be decoded using a
jump table, which in certain cases will make the case statement execute
faster.

\subsection{ Stack frame omission }
When using the \var{-Ox} switch, under certain specific conditions, the stack
frame (entry and exit code for the routine) will be omitted, and variables
will be accessed directly via the stack pointer.

Conditions for omission of the stack frame:
\begin{itemize}
\item Routine does not call other routines
\item Routine does not contain assembler statements
\item Routine is not declared using the \var{Interrupt} directive
\item Routine is not a constructor or destructor
\end{itemize}

\subsection{ Register variables }
When using the \var{-Ox} switch, local variables or parameters which are used
very often will be moved to registers for faster access.

\emph{ Remark: } Register variable allocation is currently broken and should
not be used.
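
As a rough illustration of some of the optimizations described in this
section, the following sketch groups a few of the constructs mentioned above
in one program. The program and all names in it (\var{OptDemo}, \var{MyData},
\var{c}) are made up for this example, and the comments only repeat what the
preceding subsections state; the code that is actually generated depends on
the compiler version and the switches used.
\begin{verbatim}
program OptDemo;

var
  MyData : array[0..9] of longint;
  x, i   : longint;
  c      : char;

begin
  { Constant folding: the right hand side is evaluated at compile time }
  x := 1+2+3+6+5;
  { Constant index: the offset of element 5 is known at compile time }
  MyData[5] := x;
  { Power of two operands: candidates for shift instructions }
  i := x * 8;
  i := i div 4;
  { Constant set: a range test written with the in operator }
  c := 'q';
  if c in ['a'..'z'] then
    Writeln('lower case letter');
  { Short cut evaluation: the second operand is skipped when i = 0 }
  if (i <> 0) and (x div i > 0) then
    Writeln(i);
end.
\end{verbatim}
Comparing the assembler output of such a program with and without the
optimization switches is one way to check which of these optimizations a
given compiler version applies.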

\subsection{ Intel x86 specific }
Here follows a listing of the optimizing techniques used in the compiler:
\begin{enumerate}
\item When optimizing for a specific processor
(\var{-O3, -O4, -O5, -O6}), the following is done:
\begin{itemize}
\item In \var{case} statements, a check is done whether a jump table or a
sequence of conditional jumps should be used for optimal performance.
\item A number of strategies are determined when doing peephole
optimization; for example, \var{movzbl (\%ebp), \%eax} is used on PentiumPro
and PII systems, but is changed into
\var{xorl \%eax,\%eax; movb (\%ebp),\%al } for lesser systems.
\end{itemize}
Cyrix \var{6x86} processor owners should optimize with \var{-O4} instead of
\var{-O5}, because \var{-O5} leads to larger code, and thus to lower speed,
according to the Cyrix developers FAQ.
\item When optimizing for speed (\var{-OG}) or size (\var{-Og}), a choice is
made between using shorter instructions (for size) such as \var{enter \$4},
or longer instructions such as \var{subl \$4,\%esp} (for speed). When smaller
size is requested, things aren't aligned on 4-byte boundaries. When speed is
requested, things are aligned on 4-byte boundaries as much as possible.
\item Simple optimization (\var{-Oa}) makes sure the peephole optimizer is
used, as well as the reloading optimizer.
\item Uncertain optimizations (\var{-Oz}): With this switch, the reloading
optimizer (enabled with \var{-Oa}) can be forced into making uncertain
optimizations.

You can enable uncertain optimizations only in certain cases, otherwise you
will produce a bug; the following technical description tells you when to use
them:
\begin{quote}
% Jonas's own words..
\em
If uncertain optimizations are enabled, the reloading optimizer assumes that
\begin{itemize}
\item If something is written to a local/global register or a
procedure/function parameter, this value doesn't overwrite the value to which
a pointer points.
\item If something is written to memory pointed to by a pointer variable,
this value doesn't overwrite the value of a local/global variable or a
procedure/function parameter.
\end{itemize}
% end of quote
\end{quote}
The practical upshot of this is that you cannot use the uncertain
optimizations if you access any local or global variables through pointers.
In theory, this includes \var{Var} parameters, but it is all right as long as
you don't access the variable both through its \var{Var} reference and
through its name.

The following example will produce bad code when you switch on uncertain
optimizations:
\begin{verbatim}
Var temp: Longint;

Procedure Foo(Var Bar: Longint);
Begin
  If (Bar = temp)
    Then
      Begin
        Inc(Bar);
        If (Bar <> temp) then Writeln('bug!')
      End
End;

Begin
  Foo(Temp);
End.
\end{verbatim}
The reason it produces bad code is that you access the global variable
\var{Temp} both through its name \var{Temp} and through a pointer, in this
case using the \var{Bar} variable parameter, which is nothing but a pointer
to \var{Temp} in the above code.

On the other hand, you can use the uncertain optimizations if you access
global/local variables or parameters through pointers, and {\em only} access
them through this pointer\footnote{ You can use multiple pointers to point to
the same variable as well, that doesn't matter.}.

For example:
\begin{verbatim}
Type TMyRec = Record
                a, b: Longint;
              End;
     PMyRec = ^TMyRec;

     TMyRecArray = Array [1..100000] of TMyRec;
     PMyRecArray = ^TMyRecArray;

Var MyRecArrayPtr: PMyRecArray;
    MyRecPtr: PMyRec;
    Counter: Longint;

Begin
  New(MyRecArrayPtr);
  For Counter := 1 to 100000 Do
    Begin
       MyRecPtr := @MyRecArrayPtr^[Counter];
       MyRecPtr^.a := Counter;
       MyRecPtr^.b := Counter div 2;
    End;
End.
\end{verbatim}
This will produce correct code, because the global variable
\var{MyRecArrayPtr} is not accessed directly, but through a pointer
(\var{MyRecPtr} in this case).

In conclusion, one could say that you can use uncertain optimizations
{\em only} when you know what you're doing.
\end{enumerate}

\subsection{ Motorola 680x0 specific }
Using the \var{-O2} switch enables several optimizations in the code
produced, the most notable being:
\begin{itemize}
\item Sign extension from byte to long will use \var{EXTB}
\item Returning of functions will use \var{RTD}
\item Range checking will generate no run-time calls
\item Multiplication will use the long \var{MULS} instruction, no runtime
library call will be generated
\item Division will use the long \var{DIVS} instruction, no runtime library
call will be generated
\end{itemize}

\section{ Floating point }
This section contains processor specific information on the floating point
code generated by the compiler.

\subsection{ Intel x86 specific }
All normal floating point types map to their real type, including \var{comp}
and \var{extended}.

\subsection{ Motorola 680x0 specific }
Early generations of the Motorola 680x0 processors did not have integrated
floating point units, so to circumvent this fact, all floating point
operations are emulated (when the \var{\$E+} switch is on, which is the
default) using the IEEE \var{Single} floating point type. In other words,
when emulation is on, Real, Single, Double and Extended all map to the
\var{single} floating point type.

When the \var{\$E} switch is turned off, normal 68882/68881/68040 floating
point opcodes are emitted. The Real type still maps to \var{Single} but the
other types map to their true floating point types. Only basic FPU opcodes
are used, which means that the code can work correctly on 68040 processors.

\emph{ Remark: } \var{Double} and \var{Extended} types in true floating point
mode have not been extensively tested as of version 0.99.5.

\emph{ Remark: } The \var{comp} data type is currently not supported.
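
To see how the floating point types are mapped on a given target, a small
test program along the following lines can be used. This is only a sketch:
the program name \var{FloatSizes} is made up, and it assumes that
\var{SizeOf} reflects the mapping described above (so that, for instance, all
four types would report the same size when emulation is on for the 680x0).
The \var{comp} type is left out, since it is not supported on all targets.
\begin{verbatim}
program FloatSizes;

begin
  { Print the storage size of each floating point type }
  Writeln('Real     : ', SizeOf(Real), ' bytes');
  Writeln('Single   : ', SizeOf(Single), ' bytes');
  Writeln('Double   : ', SizeOf(Double), ' bytes');
  Writeln('Extended : ', SizeOf(Extended), ' bytes');
end.
\end{verbatim}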

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Appendices
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\appendix
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Appendix A
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Anatomy of a unit file}
\label{ch:AppA}
A unit file basically consists of five parts:
\begin{enumerate}
\item A unit header.
\item A file references part. This contains the references to used units and
sources with name, checksum and time stamps.
\item A definition part. Contains all type and procedure definitions.
\item A symbol part. Contains all symbol names and references to their
definitions.
\item A list of units that are in the implementation part.
\end{enumerate}
The header consists of a sequence of 20 bytes; together they give some
information about the unit file, the compiler version that was used to
generate the unit file, etc. The complete layout can be found in
\seet{UnitHeader}. The header is generated by the compiler, and changes only
when the compiler changes. The current and up-to-date header definition can
be found in the \file{files.pas} source file of the compiler. Look in this
file for the \var{unitheader} constant declaration.
\begin{FPCltable}{ll}{Unit header structure.}{UnitHeader}
\hline
Byte & What is stored \\ \hline
0..3 & The letters 'PPU' in upper case. This acts as a check. \\
4..6 & The unit format as a 3 letter sequence : e.g. '0','1','2' for format 12. \\
7,8 & The compiler version and release numbers as bytes. \\
9 & The target OS number. \\
10 & Unit flags.\\
11..14 & Checksum (as a longint). \\
15,16 & unused (equal to 255). \\
17..20 & Marks start of unit file. \\ \hline
\end{FPCltable}
After the header, in the second part, first the list of all source files for
the unit is written. Each name is written as a direct copy of the string in
memory, i.e. a length byte, and then all characters of the string.
This list includes any file that was included in the unit source with the
\var{\{\$i file\}} directive. The list is terminated with a \var{\$ff} byte
marker.

After this, the list of units in the \var{uses} clause is written, together
with their checksums. The name is written as a string, the checksum as a
longint (i.e. four bytes). Again this list is terminated with a \var{\$ff}
byte marker.

After that, in the third part, the definitions of all types, variables,
constants, procedures and functions are written to the unit file. They are
written in the following manner: First a byte is written, which determines
the kind of definition that follows. Then follows, as a series of bytes, a
type-dependent description of the definition. The exact byte order for each
type can be found in \seet{DefDef}.
\begin{FPCltable}{lccl}{Description of definition fields}{DefDef}
\hline
Type & Start byte & Size & Stored fields \\
\hline\hline
Pointer & 3 & 4 & Reference to the type the pointer points to. \\ \hline
Base type & 2 & 9 &
\begin{tabular}[t]{l}
1 byte to indicate base type. \\
4-byte start range \\
4-byte end range \\
\end{tabular}\\ \hline
Array type &5 & 16 &
\begin{tabular}[t]{l}
4-byte reference to element type. \\
4-byte reference to range type.\\
4-byte start range (longint) \\
4-byte end range (longint)\\
\end{tabular} \\ \hline
Procedure & 6 & ? &
\begin{tabular}[t]{l}
4-byte reference to the return type definition. \\
2 byte Word containing modifiers. \\
2 byte Word containing number of parameters. \\
5 bytes per parameter.\\
1 byte : used registers. \\
String containing the mangled name. \\
8 bytes.
\end{tabular} \\ \hline
Procedural type & 21 & ? &
\begin{tabular}[t]{l}
4-byte reference to the return type definition. \\
2 byte Word containing modifiers. \\
2 byte Word containing number of parameters. \\
5 bytes per parameter. \\
\end{tabular} \\ \hline
String & 9 & 1 & 1 byte containing the length of the string.
\\ \hline
Record & 15 & variable &
\begin{tabular}[t]{l}
Longint indicating record length \\
list of fields, to be read as unit in itself. \\
\var{\$ff} end marker.
\end{tabular} \\ \hline
Class & 18 & variable &
\begin{tabular}[t]{l}
Longint indicating data length \\
String with mangled name of class.\\
4 byte reference to ancestor class.\\
list of fields, to be read as unit in itself. \\
\var{\$ff} end marker.
\end{tabular} \\ \hline
file & 16 & 1(+4) &
\begin{tabular}[t]{l}
1 byte for type of file. \\
4-byte reference to type of typed file.
\end{tabular}\\ \hline
Enumeration & 19 & 4 & Biggest element. \\ \hline
set & 20 & 5 &
\begin{tabular}[t]{l}
4-byte reference to set element type. \\
1 byte flag.
\end{tabular} \\ \hline
\hline
\end{FPCltable}
This list of definitions is again terminated with a \var{\$ff} byte marker.

After that, a list of symbols is given, together with a reference to a
definition. This represents the names of the declarations, and the definition
they refer to.

A reference consists of 2 words: the first word indicates the unit number (as
it appears in the uses clause), and the second word is the number of the
definition in that unit. A \var{nil} reference is stored as
\var{\$ffffffff}.

After this follows again a \var{\$ff} byte terminated list of filenames: the
names of the units in the \var{uses} clause of the implementation section.
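
The reference format described above could be modelled as follows. This is a
minimal sketch and not the compiler's own declaration: the names
\var{RefDemo}, \var{TDefRef}, \var{UnitNr}, \var{DefNr} and \var{IsNilRef}
are assumptions introduced for this example only.
\begin{verbatim}
program RefDemo;

type
  { A reference: 2 words, unit number first, then definition number }
  TDefRef = packed record
    UnitNr : word;  { number of the unit, as it appears in the uses clause }
    DefNr  : word;  { number of the definition in that unit }
  end;

{ A nil reference is stored as $ffffffff, i.e. both words are $ffff }
function IsNilRef(const r : TDefRef) : boolean;
begin
  IsNilRef := (r.UnitNr = $ffff) and (r.DefNr = $ffff);
end;

var
  r : TDefRef;

begin
  r.UnitNr := $ffff;
  r.DefNr := $ffff;
  if IsNilRef(r) then
    Writeln('nil reference');
end.
\end{verbatim}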

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Appendix B
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\chapter{Compiler and RTL source tree structure}
%\label{ch:AppB}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Appendix C
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Compiler limits}
\label{ch:AppC}
Although many of the restrictions imposed by the MS-DOS system are removed by
use of an extender, or use of another operating system, there still are some
limitations to the compiler:
\begin{enumerate}
\item String constants are limited to 128 characters. All other characters
are simply dropped from the definition.
\item The length of generated unit files is limited to 65K for the real-mode
compiler, and to 1Mb for the 32-bit compiler. This limit can be changed by
changing the \var{bytearray1} type in \file{cobjects.pas}.
\item Procedure or Function definitions can be nested to a level of 32.
\item At most 255 units can be used in a program when using the real-mode
compiler. When using the 32-bit compiler, the limit is set to 1024. You can
change this by redefining the \var{maxunits} constant in the
\file{files.pas} compiler source file.
\end{enumerate}
\end{document}