%
%   $Id$
%   This file is part of the FPC documentation.
%   Copyright (C) 1997, by Michael Van Canneyt
%
%   The FPC documentation is free text; you can redistribute it and/or
%   modify it under the terms of the GNU Library General Public License as
%   published by the Free Software Foundation; either version 2 of the
%   License, or (at your option) any later version.
%
%   The FPC Documentation is distributed in the hope that it will be useful,
%   but WITHOUT ANY WARRANTY; without even the implied warranty of
%   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
%   Library General Public License for more details.
%
%   You should have received a copy of the GNU Library General Public
%   License along with the FPC documentation; see the file COPYING.LIB. If not,
%   write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330,
%   Boston, MA 02111-1307, USA.
%
\documentclass{report}
%
% Preamble
%
\usepackage{html}
\usepackage{htmllist}
\usepackage{epsfig}
\usepackage{multicol}
\usepackage{fpc}
\latex{%
  \ifpdf
  \pdfinfo{/Author(Michael Van Canneyt)
           /Title(Programmers' Guide)
           /Subject(Free Pascal Programmers' guide)
           /Keywords(Free Pascal)
           }
  \fi
}
%
\html{\input{fpc-html.tex}}
%
% Settings
%
\makeindex
%
% Start of document.
%
\begin{document}
\title{Free Pascal \\ Programmers' manual}
\docdescription{Programmers' manual for \fpc, version \fpcversion}
\docversion{1.6}
\input{date.inc}
\author{Micha\"el Van Canneyt}
\maketitle
\tableofcontents
\newpage
\listoftables
\newpage

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Introduction
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section*{About this document}
This is the programmer's manual for \fpc. It describes some of the peculiarities of the \fpc compiler, and provides a glimpse of how the compiler generates its code, and how you can change the generated code.
It will not, however, provide you with a detailed account of the inner workings of the compiler, nor will it tell you how to use the compiler (described in the \userref). It also will not describe the inner workings of the Run-Time Library (RTL). The best way to learn about the way the RTL is implemented is from the sources themselves.
The things described here are useful if you want to do things which need greater flexibility than the standard Pascal language constructs (described in the \refref).
Since the compiler is continuously under development, this document may get out of date. Wherever possible, the information in this manual will be updated. If you find something which isn't correct, or you think something is missing, feel free to contact me\footnote{at \var{Michael.VanCanneyt@wisa.be}}.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Compiler switches
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Compiler directives}
\label{ch:CompSwitch}
\fpc supports compiler directives in your source file. They are not the same as Turbo Pascal directives, although some are supported for compatibility. There is a distinction between local and global directives; local directives take effect from the moment they are encountered, global directives have an effect on all of the compiled code.
Many switches also have a long form; where they do, the name of the long form is given as well. For long switches, the + or - character that switches the option on or off may be replaced by the \var{ON} or \var{OFF} keywords. Thus \verb|{$I+}| is equivalent to \verb|{$IOCHECKS ON}| or \verb|{$IOCHECKS +}| and \verb|{$C-}| is equivalent to \verb|{$ASSERTIONS OFF}| or \verb|{$ASSERTIONS -}|. The long forms of the switches are the same as their Delphi counterparts.
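As a small illustration of the two notations, the following sketch toggles the I/O checking and assertion switches, once with the short form and once with the long form. The program name \var{SwitchForms} is made up for this example, and the sketch only relies on the equivalences listed above.
\begin{verbatim}
program SwitchForms;

begin
  {$I-}              { short form: I/O checking off }
  {$IOCHECKS ON}     { long form: I/O checking on again }
  {$C-}              { short form: assertions off }
  {$ASSERTIONS ON}   { long form: assertions on again }
  Writeln ('Both notations select the same switches.');
end.
\end{verbatim}
Which notation to use is a matter of taste; the long forms tend to be easier to read for someone who does not know the single-letter switches by heart.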
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Local switches
\section{Local directives}
\label{se:LocalSwitch}
Local directives can occur more than once in a unit or program. If they have a command-line counterpart, the command-line argument is restored as the default for each compiled file. The local directives influence the compiler's behaviour from the moment they're encountered until the moment another switch overrides their behaviour, or the end of the current unit or program is reached.

\subsection{\var{\$A} or \var{\$ALIGN}: Align Data}
This switch is recognized for Turbo Pascal compatibility, but is not yet implemented. The alignment of data will be different in any case, since \fpc is a 32-bit compiler.

\subsection{\var{\$ASMMODE} : Assembler mode}
\label{se:AsmReader}
The \var{\{\$ASMMODE XXX\}} directive informs the compiler what kind of assembler it can expect in an \var{asm} block. The \var{XXX} should be replaced by one of the following:
\begin{description}
\item [att\ ] Indicates that \var{asm} blocks contain AT\&T syntax assembler.
\item [intel\ ] Indicates that \var{asm} blocks contain Intel syntax assembler.
\item [direct\ ] Tells the compiler that asm blocks should be copied directly to the assembler file.
\end{description}
These switches are local, and retain their value to the end of the unit that is compiled, unless they are replaced by another directive of the same type. The command-line switch that corresponds to this switch is \var{-R}. The default assembler reader is the AT\&T reader.

\subsection{\var{\$B} or \var{\$BOOLEVAL}: Complete boolean evaluation}
This switch is understood by the \fpc compiler, but is ignored. The compiler always uses shortcut evaluation, i.e. the evaluation of a boolean expression is stopped once the result of the total expression is known with certainty. So, in the following example, the function \var{Bofu}, which has a boolean result, will never get called.
\begin{verbatim}
If False and Bofu then
  ...
\end{verbatim}
This has as a consequence that any additional actions that are done by \var{Bofu} are not executed.

\subsection{\var{\$C} or \var{\$ASSERTIONS} : Assertion support}
The \var{\{\$ASSERTIONS\}} switch determines if assert statements are compiled into the binary or not. If the switch is on, the statement
\begin{verbatim}
Assert(BooleanExpression,AssertMessage);
\end{verbatim}
will be compiled into the binary. If the \var{BooleanExpression} evaluates to \var{False}, the RTL will check if the \var{AssertErrorProc} is set. If it is set, it will be called with as parameters the \var{AssertMessage} message, the name of the file, the line number and the address. If it is not set, a run-time error 227 is generated. The \var{AssertErrorProc} is defined as
\begin{verbatim}
Type
  TAssertErrorProc=procedure(const msg,fname:string;lineno,erroraddr:longint);
Var
  AssertErrorProc : TAssertErrorProc;
\end{verbatim}
This can be used mainly for debugging purposes. The \file{SYSTEM} unit sets the \var{AssertErrorProc} to a handler that displays a message on \var{stderr} and simply exits. The \file{SYSUTILS} unit catches the run-time error 227 and raises an \var{EAssertionFailed} exception.

\subsection{\var{\$DEFINE} : Define a symbol}
The directive
\begin{verbatim}
{$DEFINE name}
\end{verbatim}
defines the symbol \var{name}. This symbol remains defined until the end of the current module (i.e. unit or program), or until a \var{\$UNDEF name} directive is encountered. If \var{name} is already defined, this has no effect. \var{Name} is case insensitive. The symbols that are defined in a unit are not saved in the unit file, so they are also not exported from a unit.

\subsection{\var{\$ELSE} : Switch conditional compilation}
The \var{\{\$ELSE \}} switches between compiling and ignoring the source text delimited by the preceding \var{\{\$IFxxx\}} and following \var{\{\$ENDIF\}}.
Any text after the \var{ELSE} keyword but before the brace is ignored:
\begin{verbatim}
{$ELSE some ignored text}
\end{verbatim}
is the same as
\begin{verbatim}
{$ELSE}
\end{verbatim}
This is useful for indicating which switch is meant.

\subsection{\var{\$ENDIF} : End conditional compilation}
The \var{\{\$ENDIF\}} directive ends the conditional compilation initiated by the last \var{\{\$IFxxx\}} directive. Any text after the \var{ENDIF} keyword but before the closing brace is ignored:
\begin{verbatim}
{$ENDIF some ignored text}
\end{verbatim}
is the same as
\begin{verbatim}
{$ENDIF}
\end{verbatim}
This is useful for indicating which switch is meant to be ended.

\subsection{\var{\$ERROR} : Generate error message}
The following code
\begin{verbatim}
{$ERROR This code is erroneous !}
\end{verbatim}
will display an error message when the compiler encounters it, and increase the error count of the compiler. The compiler will continue to compile, but no code will be emitted.

\subsection{\var{\$F} : Far or near functions}
This directive is recognized for compatibility with Turbo Pascal. Under the 32-bit programming model, the concept of near and far calls has no meaning, hence the directive is ignored. A warning is printed to the screen, telling you so.
As an example, the following piece of code:
\begin{verbatim}
{$F+}

Procedure TestProc;

begin
  Writeln ('Hello From TestProc');
end;

begin
  testProc
end.
\end{verbatim}
generates the following compiler output:
\begin{verbatim}
malpertuus: >pp -vw testf
Compiler: ppc386
Units are searched in: /home/michael;/usr/bin/;/usr/lib/ppc/0.9.1/linuxunits
Target OS: Linux
Compiling testf.pp
testf.pp(1) Warning: illegal compiler switch
7739 kB free
Calling assembler...
Assembled...
Calling linker...
12 lines compiled, 1.00000000000000E+0000
\end{verbatim}
You can see that the verbosity level was set to display warnings.
If you declare a function as \var{Far} (this has the same effect as setting it between \var{\{\$F+\}...\{\$F-\}} directives), the compiler also generates a warning:
\begin{verbatim}
testf.pp(3) Warning: FAR ignored
\end{verbatim}
The same is true for procedures declared as \var{Near}. The warning displayed in that case is:
\begin{verbatim}
testf.pp(3) Warning: NEAR ignored
\end{verbatim}

\subsection{\var{\$FATAL} : Generate fatal error message}
The following code
\begin{verbatim}
{$FATAL This code is erroneous !}
\end{verbatim}
will display an error message when the compiler encounters it, and the compiler will immediately stop the compilation process.
This is mainly useful in conjunction with \var{\{\$IFDEF \}} or \var{\{\$IFOPT \}} statements.

\subsection{\var{\$GOTO} : Support \var{Goto} and \var{Label}}
If \var{\{\$GOTO ON\}} is specified, the compiler will support \var{Goto} statements and \var{Label} declarations. By default, \var{\$GOTO OFF} is assumed. This directive corresponds to the \var{-Sg} command-line option.
As an example, the following code can be compiled:
\begin{verbatim}
{$GOTO ON}

label Theend;

begin
  If ParamCount=0 then
    GoTo TheEnd;
  Writeln ('You specified command-line options');
TheEnd:
end.
\end{verbatim}
\begin{remark}
If you compile assembler code not in direct mode (using the Intel or AT\&T assembler readers) you must declare any labels you use in the assembler code and use \var{\{\$GOTO ON\}}. If you compile in direct mode then this is not necessary.
\end{remark}

\subsection{\var{\$H} or \var{\$LONGSTRINGS} : Use AnsiStrings}
If \var{\{\$LONGSTRINGS ON\}} is specified, the keyword \var{String} (no length specifier) will be treated as \var{AnsiString}, and the compiler will treat the corresponding variable as an ansistring, and will generate corresponding code.
By default, the use of ansistrings is off, corresponding to \var{\{\$H-\}}. The system unit is compiled without ansistrings; all its functions accept shortstring arguments.
The same is true for all RTL units, except the \file{sysutils} unit, which is compiled with ansistrings.

\subsection{\var{\$HINT} : Generate hint message}
If the generation of hints is turned on, through the \var{-vh} command-line option or the \var{\{\$HINTS ON\}} directive, then
\begin{verbatim}
{$Hint This code should be optimized }
\end{verbatim}
will display a hint message when the compiler encounters it.
By default, no hints are generated.

\subsection{\var{\$HINTS} : Emit hints}
\var{\{\$HINTS ON\}} switches the generation of hints on. \var{\{\$HINTS OFF\}} switches the generation of hints off. Contrary to the command-line option \var{-vh}, this is a local switch, which is useful for checking parts of your code.

\subsection{\var{\$IF} : Start conditional compilation}
The directive \var{\{\$IF expr\}} will continue the compilation if the boolean expression \var{expr} evaluates to \var{true}. If the expression evaluates to false, then the source is skipped to the first \var{\{\$ELSE\}} or \var{\{\$ENDIF\}} directive.
The compiler must be able to evaluate the expression at parse time. This means that you cannot use variables or constants that are defined in the source. Macros and symbols may be used, however.
More information on this can be found in the section about conditionals.

\subsection{\var{\$IFDEF Name} : Start conditional compilation}
If the symbol \var{Name} is not defined then the \var{\{\$IFDEF name\}} will skip the compilation of the text that follows it to the first \var{\{\$ELSE\}} or \var{\{\$ENDIF\}} directive. If \var{Name} is defined, then compilation continues as if the directive wasn't there.

\subsection{\var{\$IFNDEF} : Start conditional compilation}
If the symbol \var{Name} is defined then the \var{\{\$IFNDEF name\}} will skip the compilation of the text that follows it to the first \var{\{\$ELSE\}} or \var{\{\$ENDIF\}} directive. If it is not defined, then compilation continues as if the directive wasn't there.
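The conditional directives described so far can be combined as in the following sketch. The program name \var{CondDemo} and the symbols \var{DEBUG} and \var{RELEASE} are made up for this illustration; the example assumes that \var{RELEASE} is not defined anywhere else, for instance on the command line.
\begin{verbatim}
program CondDemo;

{$DEFINE DEBUG}

begin
{$IFDEF DEBUG}
  Writeln ('DEBUG is defined');
{$ELSE}
  Writeln ('DEBUG is not defined');
{$ENDIF DEBUG}
{$IFNDEF RELEASE}
  Writeln ('RELEASE is not defined');
{$ENDIF RELEASE}
end.
\end{verbatim}
Under these assumptions only the first and the third \var{Writeln} statements are compiled; the statement in the \var{\{\$ELSE\}} branch is skipped by the compiler.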
\subsection{\var{\$IFOPT} : Start conditional compilation}
The \var{\{\$IFOPT switch\}} will compile the text that follows it if the switch \var{switch} is currently in the specified state. If it isn't in the specified state, then compilation continues after the corresponding \var{\{\$ELSE\}} or \var{\{\$ENDIF\}} directive.
As an example:
\begin{verbatim}
{$IFOPT M+}
  Writeln ('Compiled with type information');
{$ENDIF}
\end{verbatim}
will compile the writeln statement if generation of type information is on.
\begin{remark}
The \var{\{\$IFOPT\}} directive accepts only short options, i.e. \var{\{\$IFOPT TYPEINFO\}} will not be accepted.
\end{remark}

\subsection{\var{\$INFO} : Generate info message}
If the generation of info is turned on, through the \var{-vi} command-line option, then
\begin{verbatim}
{$INFO This was coded on a rainy day by Bugs Bunny }
\end{verbatim}
will display an info message when the compiler encounters it.
This is useful in conjunction with the \var{\{\$IFDEF\}} directive, to show information about which part of the code is being compiled.

\subsection{\var{\$INLINE} : Allow inline code.}
The \var{\{\$INLINE ON\}} directive tells the compiler that the \var{Inline} procedure modifier should be allowed. Procedures that are declared inline are copied to the places where they are called. This has the effect that there is no actual procedure call: the code of the procedure is just copied to where the procedure is needed, which results in faster execution if the function or procedure is used a lot.
By default, \var{Inline} procedures are not allowed. You need to specify this directive if you want to use inlined code. This directive is equivalent to the command-line switch \var{-Si}.
\begin{remark}
\begin{enumerate}
\item Inline code is NOT exported from a unit. This means that if you call an inline procedure from another unit, a normal procedure call will be performed. Only inside units, \var{Inline} procedures are really inline.
\item You cannot make recursive inline functions, i.e. an inline function that calls itself is not allowed.
\end{enumerate}
\end{remark}

\subsection{\var{\$I} or \var{\$IOCHECKS} : Input/Output checking}
The \var{\{\$I-\}} or \var{\{\$IOCHECKS OFF\}} directive tells the compiler not to generate input/output checking code in your program. By default, the compiler generates this code\footnote{This behaviour changed in the 0.99.13 release of the compiler. Earlier versions by default did not generate this code.}; it can also be switched on using the \var{-Ci} command-line switch.
If you compile using the \var{-Ci} compiler switch, the \fpc compiler inserts input/output checking code after every input/output call in your program. If an error occurred during input or output, then a run-time error will be generated. Use the \var{\{\$I-\}} switch if you wish to avoid this behaviour. If you still want to check if something went wrong, you can use the \var{IOResult} function to see if everything went without problems.
Conversely, \var{\{\$I+\}} will turn error-checking back on, until another directive is encountered which turns it off again.
The most common use for this switch is to check if the opening of a file went without problems, as in the following piece of code:
\begin{verbatim}
...
assign (f,'file.txt');
{$I-}
rewrite (f);
{$I+}
if IOResult<>0 then
  begin
  Writeln ('Error opening file : "file.txt"');
  exit
  end;
...
\end{verbatim}
See the \var{IOResult} function explanation in the reference manual for a detailed description of the possible errors that can occur when using input/output checking.

\subsection{\var{\$I} or \var{\$INCLUDE} : Include file }
The \var{\{\$I filename\}} or \var{\{\$INCLUDE filename\}} directive tells the compiler to read further statements from the file \var{filename}. The statements read there will be inserted as if they occurred in the current file.
The compiler will append the \file{.pp} extension to the file if you don't specify an extension yourself.
Do not put the filename between quotes, as they will be regarded as part of the file's name.
You can nest included files, but not infinitely deep. The number of files is restricted to the number of file descriptors available to the \fpc compiler.
Contrary to Turbo Pascal, include files can cross blocks. I.e. you can start a block in one file (with a \var{Begin} keyword) and end it in another (with an \var{End} keyword). The smallest entity in an include file must be a token, i.e. an identifier, keyword or operator.
The compiler will look for the file to include in the following places:
\begin{enumerate}
\item It will look in the path specified in the include file name.
\item It will look in the directory where the current source file is.
\item It will look in all directories specified in the include file search path.
\end{enumerate}
You can add directories to the include file search path with the \var{-I} command-line option.

\subsection{\var{\$I} or \var{\$INCLUDE} : Include compiler info}
In this form:
\begin{verbatim}
{$INCLUDE %xxx%}
\end{verbatim}
where \var{xxx} is one of \var{TIME}, \var{DATE}, \var{FPCVERSION} or \var{FPCTARGET}, the directive will generate a macro with the value of these things. If \var{xxx} is none of the above, then it is assumed to be the name of an environment variable. Its value will be fetched, and inserted in the code as if it were a string.
For example, the following program
\begin{verbatim}
Program InfoDemo;

Const User = {$I %USER%};

begin
  Write ('This program was compiled at ',{$I %TIME%});
  Writeln (' on ',{$I %DATE%});
  Writeln ('By ',User);
  Writeln ('Compiler version : ',{$I %FPCVERSION%});
  Writeln ('Target CPU : ',{$I %FPCTARGET%});
end.
\end{verbatim}
creates the following output:
\begin{verbatim}
This program was compiled at 17:40:18 on 1998/09/09
By michael
Compiler version : 0.99.7
Target CPU : i386
\end{verbatim}

% Assembler type
\subsection{\var{\$I386\_XXX} : Specify assembler format}
This switch selects the assembler reader.
\var{\{\$I386\_XXX\}} has the same effect as \var{\{\$ASMMODE XXX\}} (\sees{AsmReader}).
This switch is deprecated; the \var{\{\$ASMMODE XXX\}} directive should be used instead.

\subsection{\var{\$L} or \var{\$LINK} : Link object file}
The \var{\{\$L filename\}} or \var{\{\$LINK filename\}} directive tells the compiler that the file \file{filename} should be linked to your program. This cannot be used for libraries, see section \sees{linklib} for that.
The compiler will look for this file in the following way:
\begin{enumerate}
\item It will look in the path specified in the object file name.
\item It will look in the directory where the current source file is.
\item It will look in all directories specified in the object file search path.
\end{enumerate}
You can add directories to the object file search path with the \var{-Fo} option.
On \linux systems, the name is case sensitive, and must be typed exactly as it appears on your system.
\begin{remark}
Take care that the object file you're linking is in a format the linker understands. Which format this is, depends on the platform you're on. Typing \var{ld} on the command line gives a list of formats \var{ld} knows about.
\end{remark}
You can pass other files and options to the linker using the \var{-k} command-line option. You can specify more than one of these options, and they will be passed to the linker, in the order that you specified them on the command line, just before the names of the object files that must be linked.

\subsection{\var{\$LINKLIB} : Link to a library}
\label{se:linklib}
The \var{\{\$LINKLIB name\}} will link to a library \file{name}. This has the effect of passing \var{-lname} to the linker.
As an example, consider the following unit:
\begin{verbatim}
unit getlen;

interface
{$LINKLIB c}

function strlen (P : pchar) : longint;cdecl;

implementation

function strlen (P : pchar) : longint;cdecl;external;

end.
\end{verbatim}
If one would issue the command
\begin{verbatim}
ppc386 foo.pp
\end{verbatim}
where foo.pp has the above unit in its \var{uses} clause, then the compiler would link your program to the C library, by passing the linker the \var{-lc} option.
The same effect could be obtained by removing the linklib directive in the above unit, and specifying \var{-k-lc} on the command-line:
\begin{verbatim}
ppc386 -k-lc foo.pp
\end{verbatim}

\subsection{\var{\$M} or \var{\$TYPEINFO} : Generate type info}
For classes that are compiled in the \var{\{\$M+ \}} or \var{\{\$TYPEINFO ON\}} state, the compiler will generate Run-Time Type Information (RTTI). All descendent objects of an object that was compiled in the \var{\{\$M+\}} state will get RTTI information too, as well as any published classes. By default, no Run-Time Type Information is generated. The \var{TPersistent} object that is present in the FCL (Free Component Library) is generated in the \var{\{\$M+\}} state. The generation of RTTI allows programmers to stream objects, and to access published properties of objects, without knowing the actual class of the object.
The run-time type information is accessible through the \var{TypInfo} unit, which is part of the \fpc Run-Time Library.
\begin{remark}
Note that the streaming system implemented by \fpc requires that you make streamable components descendent from \var{TPersistent}.
\end{remark}

\subsection{\var{\$MACRO} : Allow use of macros.}
In the \var{\{\$MACRO ON\}} state, the compiler allows you to use C-style (although not as elaborate) macros. Macros provide a means for simple text substitution. More information on using macros can be found in the \sees{Macros} section. This directive is equivalent to the command-line switch \var{-Sm}.
By default, macros are not allowed.
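A minimal sketch of what such a text substitution could look like is given below. It assumes the \var{:=} form of the \var{\{\$DEFINE\}} directive that is described in the section on macros; the program name \var{MacroDemo} and the macro name \var{GREETING} are made up for this illustration.
\begin{verbatim}
program MacroDemo;

{$MACRO ON}
{$DEFINE GREETING:='Hello from a macro'}

begin
  { GREETING is replaced by the string constant before parsing }
  Writeln (GREETING);
end.
\end{verbatim}
Without \var{\{\$MACRO ON\}} (or the \var{-Sm} option) the identifier \var{GREETING} would not be substituted, and the compiler would be expected to report it as unknown.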
\subsection{\var{\$MAXFPUREGISTERS} : Maximum number of FPU registers for variables}
The \var{\{\$MAXFPUREGISTERS XXX\}} directive tells the compiler how many floating point variables can be kept in the floating point processor registers. This switch is ignored unless the \var{-Or} (use register variables) optimization is used.
Since version 0.99.14, the \fpc compiler supports floating point register variables; the content of these variables is not stored on the stack, but is kept in the floating point processor stack.
This is quite tricky because the Intel FPU stack is limited to 8 entries. The compiler uses a heuristic algorithm to determine how many variables should be put onto the stack: in leaf procedures it is limited to 3 and in non-leaf procedures to 1. But in case of a deep call tree or, even worse, a recursive procedure this can still lead to an FPU stack overflow, so the user can tell the compiler how many (floating point) variables should be kept in registers.
The directive accepts the following arguments:
\begin{description}
\item [N] where \var{N} is the maximum number of FPU registers to use. Currently this can be in the range 0 to 7.
\item[Normal] restores the heuristic and standard behaviour.
\item[Default] restores the heuristic and standard behaviour.
\end{description}
\begin{remark}
The directive is valid until the end of the current procedure.
\end{remark}

\subsection{\var{\$MESSAGE} : Generate info message}
If the generation of info is turned on, through the \var{-vi} command-line option, then
\begin{verbatim}
{$MESSAGE This was coded on a rainy day by Bugs Bunny }
\end{verbatim}
will display an info message when the compiler encounters it.
The effect is the same as the \var{\{\$INFO\}} directive.

\subsection{\var{\$MMX} : Intel MMX support}
As of version 0.9.8, \fpc supports optimization for the \textbf{MMX} Intel processor (see also \ref{ch:MMXSupport}).
This optimizes certain code parts for the \textbf{MMX} Intel processor, thus greatly improving speed. The speed-up is noticed mostly when moving large amounts of data. Things that change are
\begin{itemize}
\item Data with a size that is a multiple of 8 bytes is moved using the \var{movq} assembler instruction, which moves 8 bytes at a time.
\end{itemize}
\begin{remark}
MMX support is NOT emulated on non-MMX systems, i.e. if the processor doesn't have the MMX extensions, you cannot use the MMX optimizations.
\end{remark}
When \textbf{MMX} support is on, you aren't allowed to do floating point arithmetic. You are allowed to move floating point data, but no arithmetic can be done. If you wish to do floating point math anyway, you must first switch off \textbf{MMX} support and clear the FPU using the \var{emms} function of the \file{cpu} unit.
The following example will make this more clear:
\begin{verbatim}
Program MMXDemo;

uses cpu;

var
   d1 : double;
   a : array[0..10000] of double;
   i : longint;

begin
   d1:=1.0;
{$mmx+}
   { floating point data is used, but we do _no_ arithmetic }
   for i:=0 to 10000 do
     a[i]:=d1;  { this is done with 64 bit moves }
{$mmx-}
   emms;   { clear fpu }
   { now we can do floating point arithmetic }
   ....
end.
\end{verbatim}
See, however, the chapter on MMX (\ref{ch:MMXSupport}) for more information on this topic.

\subsection{\var{\$NOTE} : Generate note message}
If the generation of notes is turned on, through the \var{-vn} command-line option or the \var{\{\$NOTES ON\}} directive, then
\begin{verbatim}
{$NOTE Ask Santa Claus to look at this code }
\end{verbatim}
will display a note message when the compiler encounters it.

\subsection{\var{\$NOTES} : Emit notes}
\var{\{\$NOTES ON\}} switches the generation of notes on. \var{\{\$NOTES OFF\}} switches the generation of notes off. Contrary to the command-line option \var{-vn}, this is a local switch, which is useful for checking parts of your code.
By default, \var{\{\$NOTES \}} is off.
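Because the switch is local, it can be used to limit note messages to one region of a source file. The following sketch illustrates the idea; the program name \var{NoteDemo} and the message texts are made up, and the comments describe the behaviour as documented above rather than verified compiler output.
\begin{verbatim}
program NoteDemo;

{$NOTES ON}
{$NOTE Notes are on here, so this message is shown }
{$NOTES OFF}
{$NOTE Notes are off here, so this message is not shown }

begin
end.
\end{verbatim}
The same pattern applies to the \var{\{\$HINTS\}} and \var{\{\$WARNINGS\}} switches and their corresponding message directives.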
\subsection{\var{\$OUTPUT\_FORMAT} : Specify the output format}
\var{\{\$OUTPUT\_FORMAT format\}} has the same functionality as the \var{-A} command-line option: it tells the compiler what kind of object file must be generated. You can specify this switch only {\em before} the \var{Program} or \var{Unit} clause in your source file. The different kinds of formats are shown in \seet{Formats}.
The default output format depends on the platform the compiler was compiled on.
\begin{FPCltable}{ll}{Formats generated by the x86 compiler}{Formats} \hline
Switch value & Generated format \\ \hline
AS & AT\&T assembler file. \\
AS\_AOUT & Go32v1 assembler file.\\
ASW & AT\&T Win32 assembler file. \\
COFF & Go32v2 COFF object file.\\
MASM & Masm assembler file.\\
NASM & Nasm assembler file.\\
NASMCOFF & Nasm assembler file (COFF format).\\
NASMELF & Nasm assembler file (ELF format).\\
PECOFF & PECOFF object file (Win32).\\
TASM & Tasm assembler file.\\
\end{FPCltable}

\subsection{\var{\$P} or \var{\$OPENSTRINGS} : Use open strings}
If this switch is on, all function or procedure parameters of type string are considered to be open string parameters; this switch only has an effect for short strings, not for ansistrings.
When using openstrings, the declared type of the string can be different from the type of string that is actually passed, even for strings that are passed by reference. The declared size of the string passed can be examined with the \var{High(P)} call.
By default, the use of openstrings is off.

\subsection{\var{\$PACKENUM} : Minimum enumeration type size}
This directive tells the compiler the minimum number of bytes it should use when storing enumerated types. It is of the following form:
\begin{verbatim}
{$PACKENUM xxx}
{$MINENUMSIZE xxx}
\end{verbatim}
where the form with \var{\$MINENUMSIZE} is for Delphi compatibility. \var{xxx} can be one of \var{1,2} or \var{4}, or \var{NORMAL} or \var{DEFAULT}, corresponding to the default value of 4.
As an alternative form one can use \var{\{\$Z1\}}, \var{\{\$Z2\}} or \var{\{\$Z4\}}. Contrary to Delphi, the default size is 4 bytes (\var{\{\$Z4\}}).
So the following code
\begin{verbatim}
{$PACKENUM 1}
Type
  Days = (monday, tuesday, wednesday, thursday, friday,
          saturday, sunday);
\end{verbatim}
will use 1 byte to store a variable of type \var{Days}, whereas it normally would use 4 bytes. The above code is equivalent to
\begin{verbatim}
{$Z1}
Type
  Days = (monday, tuesday, wednesday, thursday, friday,
          saturday, sunday);
\end{verbatim}
\begin{remark}
Sets are always put in 32 bits or 32 bytes; this cannot be changed (yet).
\end{remark}

\subsection{\var{\$PACKRECORDS} : Alignment of record elements}
This directive controls the byte alignment of the elements in a record, object or class type definition.
It is of the following form:
\begin{verbatim}
{$PACKRECORDS n}
\end{verbatim}
where \var{n} is one of 1, 2, 4, 16, \var{C}, \var{NORMAL} or \var{DEFAULT}. This means that the elements of a record that have size greater than \var{n} will be aligned on \var{n} byte boundaries. Elements with size less than or equal to \var{n} will be aligned to a natural boundary, i.e. to a power of two that is equal to or larger than the element's size. The type \var{C} is used to specify alignment as by the GNU CC compiler. It should be used only when making import units for C routines.
The default alignment (which can be selected with \var{DEFAULT}) is 2, contrary to Turbo Pascal, where it is 1.
More information on this and an example program can be found in the reference guide, in the section about record types.
\begin{remark}
Sets are always put in 32 bits or 32 bytes; this cannot be changed.
\end{remark}

\subsection{\var{\$Q} or \var{\$OVERFLOWCHECKS}: Overflow checking}
The \var{\{\$Q+\}} or \var{\{\$OVERFLOWCHECKS ON\}} directive turns on integer overflow checking. This means that the compiler inserts code to check for overflow when doing computations with integers.
When an overflow occurs, the run-time library will print a message \var{Overflow at xxx}, and exit the program with exit code 215.
\begin{remark}
Overflow checking behaviour is not the same as in Turbo Pascal since all arithmetic operations are done via 32-bit values. Furthermore, the \var{Inc()} and \var{Dec} standard system procedures {\em are} checked for overflow in \fpc, while in Turbo Pascal they are not.
\end{remark}
Using the \var{\{\$Q-\}} switch switches off the overflow checking code generation.
The generation of overflow checking code can also be controlled using the \var{-Co} command line compiler option (see \userref).

\subsection{\var{\$R} or \var{\$RANGECHECKS} : Range checking}
By default, the compiler doesn't generate code to check the ranges of array indices, enumeration types, subrange types, etc. Specifying the \var{\{\$R+\}} switch tells the compiler to generate code to check these indices. If, at run-time, an index or enumeration value is specified that is out of its declared range, then a run-time error is generated, and the program exits with exit code 201.
The \var{\{\$RANGECHECKS OFF\}} switch tells the compiler not to generate range checking code. This may result in faulty program behaviour, but no run-time errors will be generated.
\begin{remark}
The standard functions \var{val} and \var{Read} will also check ranges when the call is compiled in \var{\{\$R+\}} mode.
\end{remark}

\subsection{\var{\$SATURATION} : Saturation operations}
This works only on the Intel compiler, and MMX support must be on (\var{\{\$MMX +\}}) for this to have any effect. See the section on saturation support (\sees{SaturationSupport}) for more information on the effect of this directive.

\subsection{\var{\$SMARTLINK} : Use smartlinking}
A unit that is compiled in the \var{\{\$SMARTLINK ON\}} state will be compiled in such a way that it can be used for smartlinking.
This means that the unit is chopped into logical pieces: each procedure is
put in its own object file, and all object files are put together in a big
archive. When using such a unit, only the pieces of code that you really
need or call will be linked in your program, thus reducing the size of your
executable substantially.

Beware: using smartlinked units slows down the compilation process, because
a separate object file must be created for each procedure. If you have units
with many functions and procedures, this can be a time consuming process,
even more so if you use an external assembler (the assembler is called to
assemble each procedure or function code block separately).

The smartlinking directive should be specified {\em before} the unit
declaration part:
\begin{verbatim}
{$SMARTLINK ON}

Unit MyUnit;

Interface
 ...
\end{verbatim}
This directive is equivalent to the \var{-Cx} command-line switch.

\subsection{\var{\$STATIC} : Allow use of \var{Static} keyword.}
If you specify the \var{\{\$STATIC ON\}} directive, then \var{Static}
methods are allowed for objects. \var{Static} object methods do not require
a \var{Self} variable. They are equivalent to \var{Class} methods for
classes. By default, \var{Static} methods are not allowed. Class methods are
always allowed.

This directive is equivalent to the \var{-St} command-line option.

\subsection{\var{\$STOP} : Generate fatal error message}
The following code
\begin{verbatim}
{$STOP This code is erroneous !}
\end{verbatim}
will display an error message when the compiler encounters it.
The compiler will immediately stop the compilation process.

It has the same effect as the \var{\{\$FATAL\}} directive.

\subsection{\var{\$T} or \var{\$TYPEDADDRESS} : Typed address operator (@)}
By default, the address operator returns an untyped pointer.
In the \var{\{\$T+\}} or \var{\{\$TYPEDADDRESS ON\}} state the @ operator,
when applied to a variable, returns a result of type \var{\^{}T}, if the
type of the variable is \var{T}.
In the \var{\{\$T-\}} state, the result is always an untyped pointer, which
is assignment compatible with all other pointer types.

\subsection{\var{\$UNDEF} : Undefine a symbol}
The directive
\begin{verbatim}
{$UNDEF name}
\end{verbatim}
undefines the symbol \var{name} if it was previously defined.
\var{Name} is case insensitive.

\subsection{\var{\$V} or \var{\$VARSTRINGCHECKS} : Var-string checking}
When in the \var{+} or \var{ON} state, the compiler checks that strings
passed as parameters are of the same, identical, string type as the declared
parameters of the procedure.

\subsection{\var{\$WAIT} : Wait for enter key press}
If the compiler encounters a
\begin{verbatim}
{$WAIT }
\end{verbatim}
directive, it will resume compiling only after the user has pressed the
enter key. If the generation of info messages is turned on, then the
compiler will display the following message:
\begin{verbatim}
Press to continue
\end{verbatim}
before waiting for a keypress. Careful! This may interfere with automatic
compilation processes. It should be used for debugging purposes only.

\subsection{\var{\$WARNING} : Generate warning message}
If the generation of warnings is turned on, through the \var{-vw}
command-line option or the \var{\{\$WARNINGS ON\}} directive, then
\begin{verbatim}
{$WARNING This is dubious code }
\end{verbatim}
will display a warning message when the compiler encounters it.

\subsection{\var{\$WARNINGS} : Emit warnings}
\var{\{\$WARNINGS ON\}} switches the generation of warnings on.
\var{\{\$WARNINGS OFF\}} switches the generation of warnings off.
Contrary to the command-line option \var{-vw}, this is a local switch, which
makes it useful for checking parts of your code.

By default, no warnings are emitted.

\subsection{\var{\$X} or \var{\$EXTENDEDSYNTAX} : Extended syntax}
Extended syntax allows you to drop the result of a function. This means that
you can use a function call as if it were a procedure. By default, this
feature is on.
You can switch it off using the \var{\{\$X-\}} or
\var{\{\$EXTENDEDSYNTAX OFF\}} directive.

The following, for instance, will not compile:
\begin{verbatim}
function Func (var Arg : sometype) : longint;
begin
... { declaration of Func }
end;

...

{$X-}
Func (A);
\end{verbatim}
The reason this construct is supported is that you may wish to call a
function for certain side-effects it has, but you don't need the function
result. In this case you don't need to assign the function result, saving
you an extra variable.

The command-line compiler switch \var{-Sa1} has the same effect as the
\var{\{\$X+\}} directive.

By default, extended syntax is assumed.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Global switches
\section{Global directives}
\label{se:GlobalSwitch}
Global directives affect the whole of the compilation process. That is why
they also have a command-line counterpart. The command-line counterpart is
given for each of the directives.

\subsection{\var{\$APPTYPE} : Specify type of application (Win32 only)}
The \var{\{\$APPTYPE XXX\}} directive accepts one argument that can have two
possible values: \var{GUI} or \var{CONSOLE}. It is used to tell the Windows
operating system if an application is a console application or a graphical
application. By default, a program compiled by \fpc is a console
application. Running it will display a console window. Specifying the
\var{\{\$APPTYPE GUI\}} directive will mark the application as a graphical
application; no console window will be opened when the application is run.
If run from the command-line, the command prompt will be returned
immediately after the application was started.

Care should be taken when compiling \var{GUI} applications; the \var{Input}
and \var{Output} files are not available in a GUI application, and
attempting to read from or write to them will result in a run-time error.

It is possible to determine the application type of a Windows application at
runtime.
The \var{IsConsole} constant, declared in the Win32 system unit as
\begin{verbatim}
Const
  IsConsole : Boolean
\end{verbatim}
contains \var{True} if the application is a console application, \var{False}
if the application is a GUI application.

\subsection{\var{\$D} or \var{\$DEBUGINFO}: Debugging symbols}
When this switch is on (\var{\{\$DEBUGINFO ON\}}), the compiler inserts GNU
debugging information in the executable. The effect of this switch is the
same as the command-line switch \var{-g}. By default, insertion of debugging
information is off.

\subsection{\var{\$DESCRIPTION}}
This switch is recognised for compatibility only, but is ignored completely
by the compiler. At a later stage, this switch may be activated.

\subsection{\var{\$E} : Emulation of coprocessor}
This directive controls the emulation of the coprocessor. There is no
command-line counterpart for this directive.

\subsubsection{ Intel x86 version }
When this switch is enabled, all floating point instructions which are not
supported by standard coprocessor emulators will give out a warning.

The compiler itself doesn't do the emulation of the coprocessor.

To use coprocessor emulation under \dos go32v1 there is nothing special
required, as it is handled automatically. (As of version 0.99.10, the go32v1
platform is no longer supported.)

To use coprocessor emulation under \dos go32v2 you must use the emu387 unit,
which contains correct initialization code for the emulator.

Under \linux, the kernel takes care of the coprocessor support.

\subsubsection{ Motorola 680x0 version }
When the switch is on, no floating point opcodes are emitted by the code
generator. Instead, internal run-time library routines are called to do the
necessary calculations. In this case all real types are mapped to the single
IEEE floating point type.
\begin{remark}
By default, emulation is on. It is possible to intermix emulation code with
real floating point opcodes, as long as the only type used is single or
real.
\end{remark} \subsection{\var{\$G} : Generate 80286 code} This option is recognised for Turbo Pascal compatibility, but is ignored, since the compiler works only on 386 or higher Intel processors. \subsection{\var{\$INCLUDEPATH} : Specify include path.} This option serves to specify the include path, where the compiler looks for include files. \var{\{\$INCLUDEPATH XXX\}} will add \var{XXX} to the include path. \var{XXX} can contain one or more paths, separated by semi-colons or colons. for example \begin{verbatim} {$INCLUDEPATH ../inc;../i386} {$I strings.inc} \end{verbatim} Will add the directories \file{../inc} and \file{../i386} to the include path of the compiler. The compiler will look for the file \file{strings.inc} in both these directories, and will include the first found file. This directive is equivalent to the \var{-Fi} command-line switch. Caution is in order when using this directive: If you distribute files, the places of the files may not be the same as on your machine; moreover, the directory structure may be different. In general it would be fair to say that you should avoid using {\em absolute} paths, instead use {\em relative} paths, as in the example above. Only use this directive if you are certain of the places where the files reside. If you are not sure, it is better practice to use makefiles and makefile variables. \subsection{\var{\$L} or \var{\$LOCALSYMBOLS}: Local symbol information} This switch (not to be confused with the \var{\{\$L file\}} file linking directive) is recognised for Turbo Pascal compatibility, but is ignored. Generation of symbol information is controlled by the \var{\$D} switch. \subsection{\var{\$LIBRARYPATH} : Specify library path.} This option serves to specify the library path, where the linker looks for static or dynamic libraries. \var{\{\$LIBRARYPATH XXX\}} will add \var{XXX} to the library path. \var{XXX} can contain one or more paths, separated by semi-colons or colons. 
For example
\begin{verbatim}
{$LIBRARYPATH /usr/X11/lib;/usr/local/lib}
{$LINKLIB X11}
\end{verbatim}
will add the directories \file{/usr/X11/lib} and \file{/usr/local/lib} to
the linker library path. The linker will look for the library
\file{libX11.so} in both these directories, and use the first found file.

This directive is equivalent to the \var{-Fl} command-line switch.

Caution is in order when using this directive: If you distribute files, the
places of the libraries may not be the same as on your machine; moreover,
the directory structure may be different. In general it would be fair to say
that you should avoid using this directive. If you are not sure, it is
better practice to use makefiles and makefile variables.

\subsection{\var{\$M} or \var{\$MEMORY}: Memory sizes}
This switch can be used to set the heap and stack size. Its format is as
follows:
\begin{verbatim}
{$M StackSize,HeapSize}
\end{verbatim}
where \var{StackSize} and \var{HeapSize} should be two integer values,
greater than 1024. The first number sets the size of the stack, and the
second the size of the heap. (Stack setting is ignored under \linux.)
The two numbers can be set on the command line using the \var{-Ch} and
\var{-Cs} switches.

\subsection{\var{\$MODE} : Set compiler compatibility mode}
The \var{\{\$MODE\}} directive sets the compatibility mode of the compiler.
This is equivalent to setting one of the command-line options \var{-So},
\var{-Sd}, \var{-Sp} or \var{-S2}. It has the following arguments:
\begin{description}
\item[Default] Default mode. This reverts back to the mode that was set on
the command-line.
\item[Delphi] Delphi compatibility mode. All Object Pascal extensions are
enabled. This is the same as the command-line option \var{-Sd}.
\item[TP] Turbo Pascal compatibility mode. Object Pascal extensions are
disabled, except ansistrings, which remain valid. This is the same as the
command-line option \var{-So}.
\item[FPC] FPC mode.
This is the default, if no command-line switch is supplied. \item[OBJFPC] Object pascal mode. This is the same as the \var{-S2} command-line option. \item[GPC] GNU pascal mode. This is the same as the \var{-Sp} command-line option. \end{description} For an exact description of each of these modes, see appendix \ref{ch:AppD}, on page \pageref{ch:AppD}. \subsection{\var{\$N} : Numeric processing } This switch is recognised for Turbo Pascal compatibility, but is otherwise ignored, since the compiler always uses the coprocessor for floating point mathematics. \subsection{\var{\$O} : Overlay code generation } This switch is recognised for Turbo Pascal compatibility, but is otherwise ignored. \subsection{\var{\$OBJECTPATH} : Specify object path.} This option serves to specify the object path, where the compiler looks for object files. \var{\{\$OBJECTPATH XXX\}} will add \var{XXX} to the object path. \var{XXX} can contain one or more paths, separated by semi-colons or colons. for example \begin{verbatim} {$OBJECTPATH ../inc;../i386} {$L strings.o} \end{verbatim} Will add the directories \file{../inc} and \file{../i386} to the object path of the compiler. The compiler will look for the file \file{strings.o} in both these directories, and will link the first found file in the program. This directive is equivalent to the \var{-Fo} command-line switch. Caution is in order when using this directive: If you distribute files, the places of the files may not be the same as on your machine; moreover, the directory structure may be different. In general it would be fair to say that you should avoid using {\em absolute} paths, instead use {\em relative} paths, as in the example above. Only use this directive if you are certain of the places where the files reside. If you are not sure, it is better practice to use makefiles and makefile variables. \subsection{\var{\$S} : Stack checking} The \var{\{\$S+\}} directive tells the compiler to generate stack checking code. 
This generates code to check if a stack overflow occurred, i.e. to see
whether the stack has grown beyond its maximally allowed size. If the stack
grows beyond the maximum size, then a run-time error is generated, and the
program will exit with exit code 202.

Specifying \var{\{\$S-\}} will turn generation of stack-checking code off.

The command-line compiler switch \var{-Ct} has the same effect as the
\var{\{\$S+\}} directive.

By default, no stack checking is performed.

\subsection{\var{\$UNITPATH} : Specify unit path.}
This option serves to specify the unit path, where the compiler looks for
unit files. \var{\{\$UNITPATH XXX\}} will add \var{XXX} to the unit path.
\var{XXX} can contain one or more paths, separated by semi-colons or colons.

For example
\begin{verbatim}
{$UNITPATH ../units;../i386/units}

Uses strings;
\end{verbatim}
will add the directories \file{../units} and \file{../i386/units} to the
unit path of the compiler. The compiler will look for the file
\file{strings.ppu} in both these directories, and will link the first found
file in the program.

This directive is equivalent to the \var{-Fu} command-line switch.

Caution is in order when using this directive: If you distribute files, the
places of the files may not be the same as on your machine; moreover, the
directory structure may be different. In general it would be fair to say
that you should avoid using {\em absolute} paths, instead use {\em relative}
paths, as in the example above. Only use this directive if you are certain
of the places where the files reside. If you are not sure, it is better
practice to use makefiles and makefile variables.

\subsection{\var{\$W} or \var{\$STACKFRAMES} : Generate stackframes}
The \var{\{\$W\}} switch directive controls the generation of stackframes.
In the on state (\var{\{\$STACKFRAMES ON\}}), the compiler will generate a
stackframe for every procedure or function.
In the off state, the compiler will omit the generation of a stackframe if
the following conditions are satisfied:
\begin{itemize}
\item The procedure has no parameters.
\item The procedure has no local variables.
\item If the procedure is not an \var{assembler} procedure, it must not have
an \var{asm ... end;} block.
\item It is not a constructor or destructor.
\end{itemize}
If these conditions are satisfied, the stack frame will be omitted.

\subsection{\var{\$Y} or \var{\$REFERENCEINFO} : Insert Browser information}
This switch controls the generation of browser information. It is recognized
for compatibility with Turbo Pascal and Delphi only, as Browser information
generation is not yet fully supported.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Using conditionals and macros
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Using conditionals, messages and macros}
\label{ch:CondMessageMacro}
The \fpc compiler supports conditionals as in normal Turbo Pascal. It does,
however, more than that. It allows you to make macros which can be used in
your code, and it allows you to define messages or errors which will be
displayed when compiling.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Conditionals
\section{Conditionals}
\label{se:Conditionals}
The rules for using conditional symbols are the same as under Turbo Pascal.
Defining a symbol goes as follows:
\begin{verbatim}
{$Define Symbol }
\end{verbatim}
From this point on in your code, the compiler knows the symbol
\var{Symbol}. Symbols are, like the Pascal language, case insensitive.

You can also define a symbol on the command line. The \var{-dSymbol} option
defines the symbol \var{Symbol}. You can specify as many symbols on the
command line as you want.

Undefining an existing symbol is done in a similar way:
\begin{verbatim}
{$Undef Symbol }
\end{verbatim}
If the symbol didn't exist yet, this doesn't do anything.
If the symbol existed previously, the symbol will be erased, and will not be
recognized any more in the code following the \verb|{$Undef ...}|
statement.

You can also undefine symbols from the command line with the \var{-u}
command-line switch.

To compile code conditionally, depending on whether a symbol is defined or
not, you can enclose the code in a \verb|{$ifdef Symbol}| ..
\verb|{$endif}| pair. For instance the following code will never be
compiled:
\begin{verbatim}
{$Undef MySymbol}
{$ifdef Mysymbol}
  DoSomething;
  ...
{$endif}
\end{verbatim}
Similarly, you can enclose your code in a \verb|{$Ifndef Symbol}| ..
\verb|{$endif}| pair. Then the code between the pair will only be compiled
when the used symbol doesn't exist. For example, in the following code, the
call to \var{DoSomething} will always be compiled:
\begin{verbatim}
{$Undef MySymbol}
{$ifndef Mysymbol}
  DoSomething;
  ...
{$endif}
\end{verbatim}
You can combine the two alternatives in one structure, namely as follows
\begin{verbatim}
{$ifdef Mysymbol}
  DoSomething;
{$else}
  DoSomethingElse
{$endif}
\end{verbatim}
In this example, if \var{MySymbol} exists, then the call to
\var{DoSomething} will be compiled. If it doesn't exist, the call to
\var{DoSomethingElse} is compiled.

The \fpc compiler defines some symbols before starting to compile your
program or unit. You can use these symbols to differentiate between
different versions of the compiler, and between different compilers.
In \seet{Symbols}, a list of pre-defined symbols is given\footnote{Remark:
The \var{FPK} symbol is still defined for compatibility with older
versions.}. In that table, you should replace \var{v} with the version
number of the compiler you're using, \var{r} with the release number and
\var{p} with the patch-number of the compiler. \var{OS} must be replaced by
the type of operating system. Currently this can be one of \var{DOS},
\var{GO32V2}, \var{LINUX}, \var{OS2}, \var{WIN32}, \var{MACOS}, \var{AMIGA}
or \var{ATARI}.
The \var{OS} symbol is undefined if you specify a target that is different
from the platform you're compiling on. The \var{-TSomeOS} option on the
command line will define the \var{SomeOS} symbol, and will undefine the
existing platform symbol\footnote{In versions prior to 0.9.4, this didn't
happen, thus making Cross-compiling impossible.}.
\begin{FPCltable}{c}{Symbols defined by the compiler.}{Symbols}
\hline
FPC \\
VER\var{v} \\
VER\var{v}\_\var{r} \\
VER\var{v}\_\var{r}\_\var{p} \\
OS \\
\hline
\end{FPCltable}
As an example: Version 0.9.1 of the compiler, running on a Linux system,
defines the following symbols before reading the command line arguments:
\var{FPC}, \var{VER0}, \var{VER0\_9}, \var{VER0\_9\_1} and \var{LINUX}.
Specifying \var{-TOS2} on the command-line will undefine the \var{LINUX}
symbol, and will define the \var{OS2} symbol.
\begin{remark}
Symbols, even when they're defined in the interface part of a unit, are not
available outside that unit.
\end{remark}

In addition to the Turbo Pascal constructs, from version 0.9.8 onwards the
\fpc compiler also supports a stronger conditional compile mechanism: The
\var{\{\$If \}} construct.

The prototype of this construct is as follows:
\begin{verbatim}
{$If expr}
  CompileTheseLines;
{$else}
  BetterCompileTheseLines;
{$endif}
\end{verbatim}
In this directive \var{expr} is a Pascal expression which is evaluated using
strings, unless both parts of a comparison can be evaluated as numbers, in
which case they are evaluated using numbers\footnote{Otherwise
\var{\{\$If 8>54\}} would evaluate to \var{True}}.
If the complete expression evaluates to \var{'0'}, then it is considered
false and rejected. Otherwise it is considered true and accepted. This may
have unexpected consequences:
\begin{verbatim}
{$If 0}
\end{verbatim}
Will evaluate to \var{False} and be rejected, while
\begin{verbatim}
{$If 00}
\end{verbatim}
Will evaluate to \var{True}.
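
The difference between numeric and string comparison mentioned in the
footnote can be checked with a small sketch such as the one below. The
program name \var{iftest} is made up for this illustration, and the
messages only restate the rules given above; other compiler versions may
evaluate such expressions differently.
\begin{verbatim}
program iftest;

{ Hypothetical test program: only the compile-time messages matter. }

{ Compared as numbers, 8 is not greater than 54.
  Compared as strings, '8' would sort after '54'. }
{$if 8>54}
  {$fatal 8>54 accepted, operands were compared as strings}
{$else}
  {$info 8>54 rejected, operands were compared as numbers}
{$endif}

{ The reverse test must be accepted under numeric comparison. }
{$if 54>8}
  {$info 54>8 accepted, operands were compared as numbers}
{$else}
  {$fatal 54>8 rejected, operands were compared as strings}
{$endif}

begin
end.
\end{verbatim}
If both operands are treated as numbers, compilation succeeds and only the
two info messages are shown (provided the generation of info messages is
turned on).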
You can use any Pascal operator to construct your expression : \var{=, <>, >, <, >=, <=, AND, NOT, OR} and you can use round brackets to change the precedence of the operators. The following example shows you many of the possibilities: \begin{verbatim} {$ifdef fpc} var y : longint; {$else fpc} var z : longint; {$endif fpc} var x : longint; begin {$if (fpc_version=0) and (fpc_release>6) and (fpc_patch>4)} {$info At least this is version 0.9.5} {$else} {$fatal Problem with version check} {$endif} {$define x:=1234} {$if x=1234} {$info x=1234} {$else} {$fatal x should be 1234} {$endif} {$if 12asdf and 12asdf} {$info $if 12asdf and 12asdf is ok} {$else} {$fatal $if 12asdf and 12asdf rejected} {$endif} {$if 0 or 1} {$info $if 0 or 1 is ok} {$else} {$fatal $if 0 or 1 rejected} {$endif} {$if 0} {$fatal $if 0 accepted} {$else} {$info $if 0 is ok} {$endif} {$if 12=12} {$info $if 12=12 is ok} {$else} {$fatal $if 12=12 rejected} {$endif} {$if 12<>312} {$info $if 12<>312 is ok} {$else} {$fatal $if 12<>312 rejected} {$endif} {$if 12<=312} {$info $if 12<=312 is ok} {$else} {$fatal $if 12<=312 rejected} {$endif} {$if 12<312} {$info $if 12<312 is ok} {$else} {$fatal $if 12<312 rejected} {$endif} {$if a12=a12} {$info $if a12=a12 is ok} {$else} {$fatal $if a12=a12 rejected} {$endif} {$if a12<=z312} {$info $if a12<=z312 is ok} {$else} {$fatal $if a12<=z312 rejected} {$endif} {$if a12$7fff becomes $ffff) } audio1:=(audio1+helpdata2)-helpdata2; {$saturation-} { now mupltily with 2 and change to integer } audio1:=(audio1 shl 1)-helpdata2; {$mmx-} end. \end{verbatim} \section{Restrictions of MMX support} \label{se:MMXrestrictions} In the beginning of 1997 the MMX instructions were introduced in the Pentium processors, so multitasking systems wouldn't save the newly introduced MMX registers. To work around that problem, Intel mapped the MMX registers to the FPU register. The consequence is that you can't mix MMX and floating point operations. 
After using MMX operations and before using floating point operations, you
have to call the routine \var{EMMS} of the \var{MMX} unit. This routine
restores the FPU registers.

{\em Careful:} The compiler doesn't warn if you mix floating point and MMX
operations, so be careful.

The MMX instructions are optimized for multimedia (what else?). So it isn't
possible to perform every operation: some operations give a type mismatch.
See section \ref{se:SupportedMMX} for the supported MMX operations.

An important restriction is that MMX operations aren't range or overflow
checked, even when you turn range and overflow checking on. This is due to
the nature of MMX operations.

The \var{MMX} unit must always be used when doing MMX operations because the
exit code of this unit clears the MMX unit. If it didn't do that, other
programs would crash. A consequence of this is that you can't use MMX
operations in the exit code of your units or programs, since they would
interfere with the exit code of the \var{MMX} unit. The compiler can't check
this, so you are responsible for this!

\section{Supported MMX operations}
\label{se:SupportedMMX}
{\em Still to be written...}

\section{Optimizing MMX support}
\label{se:OptimizingMMX}
Here are some helpful hints to get optimal performance:
\begin{itemize}
\item The \var{EMMS} call takes a lot of time, so try to separate floating
point and MMX operations.
\item Use MMX only in low level routines because the compiler saves all used
MMX registers when calling a subroutine.
\item The NOT-operator isn't supported natively by MMX, so the compiler has
to generate a workaround and this operation is inefficient.
\item Simple assignments of floating point numbers don't access floating
point registers, so you need no call to the \var{EMMS} procedure. Only when
doing arithmetic do you need to call the \var{EMMS} procedure.
\end{itemize}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Memory issues
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Memory issues}
\label{ch:Memory}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The 32-bit model
\section{The 32-bit model.}
\label{se:ThirtytwoBit}
The \fpc compiler emits 32-bit code. This has several consequences:
\begin{itemize}
\item You need a 386 processor to run the generated code. The compiler
functions on a 286 when you compile it using Turbo Pascal, but the generated
programs cannot be assembled or executed.
\item You don't need to bother with segment selectors. Memory can be
addressed using a single 32-bit pointer. The amount of memory is limited
only by the available amount of (virtual) memory on your machine.
\item The structures you define are unlimited in size. Arrays can be as long
as you want. You can request memory blocks of any size.
\end{itemize}
The fact that 32-bit code is used means that some of the older Turbo Pascal
constructs and functions are obsolete. The following is a list of functions
which shouldn't be used anymore:
\begin{description}
\item [Seg()] : Returned the segment of a memory address. Since segments no
longer have any meaning, zero is returned in the \fpc run-time library
implementation of \var{Seg}.
\item [Ofs()] : Returned the offset of a memory address. Since segments no
longer have any meaning, the complete address is returned in the \fpc
implementation of this function. As a consequence, the return type is
\var{Longint} instead of \var{Word}.
\item [Cseg(), Dseg()] : Returned, respectively, the code and data segments
of your program. This returns zero in the \fpc implementation of the system
unit, since both code and data are in the same memory space.
\item [Ptr:] Accepted a segment and offset from an address, and would return
a pointer to this address. This has been changed in the run-time library.
By default, it now simply returns the offset. If you want to retain the old
functionality, you can recompile the run-time library with the
\var{DoMapping} symbol defined. This will restore the Turbo Pascal
behaviour.
\item [memw and mem] These arrays gave access to the \dos memory. \fpc
supports them on the go32v2 platform, where they are mapped into \dos memory
space. You need the \var{GO32} unit for this. On other platforms, they are
{\em not} supported.
\end{description}
You shouldn't use these functions, since they are very non-portable: they're
specific to \dos and the ix86 processor. The \fpc compiler is designed to be
portable to other platforms, so you should keep your code as portable as
possible, and not system specific. That is, unless you're writing some
driver units, of course.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The stack
\section{The stack}
\label{se:Stack}
The stack is used to pass parameters to procedures or functions, to store
local variables, and, in some cases, to return function results.

When a function or procedure is called, then the following is done by the
compiler:
\begin{enumerate}
\item If there are any parameters to be passed to the procedure, they are
pushed from right to left on the stack.
\item If a function is called that returns a variable of type \var{String},
\var{Set}, \var{Record}, \var{Object} or \var{Array}, then an address to
store the function result in is pushed on the stack.
\item If the called procedure or function is an object method, then the
pointer to \var{self} is pushed on the stack.
\item If the procedure or function is nested in another function or
procedure, then the frame pointer of the parent procedure is pushed on the
stack.
\item The return address is pushed on the stack (This is done automatically
by the instruction which calls the subroutine).
\end{enumerate}
The resulting stack frame upon entering looks as in \seet{StackFrame}.
\begin{FPCltable}{llc}{Stack frame when calling a procedure}{StackFrame}
\hline
Offset & What is stored & Optional ? \\ \hline
+x & parameters & Yes \\
+12 & function result & Yes \\
+8 & self & Yes \\
+4 & Frame pointer of parent procedure & Yes \\
+0 & Return address & No\\ \hline
\end{FPCltable}

\subsection{ Intel x86 version }
The stack is cleared with the \var{ret} I386 instruction, meaning that the
size of all pushed parameters is limited to 64K.

\subsubsection{ DOS }
Under the DOS targets, the default stack is set to 256Kb. This value cannot
be modified for the GO32V1 target. It can, however, be modified for the
GO32V2 target using a special DJGPP utility \var{stubedit}.

Note that the stack size may also be changed with some compiler switches;
this stack size will be used if it is \emph{greater} than the default stack
size, otherwise the default stack size is used.

\subsubsection{ Linux }
Under Linux, stack size is only limited by the available memory of the
system.

\subsubsection{ OS/2 }
Under OS/2, stack size is determined by one of the runtime environment
variables set for EMX. Therefore, the stack size is user defined.

\subsection{ Motorola 680x0 version }
Depending on the processor target, the stack can be cleared in two ways. If
the target processor is a MC68020 or higher, the stack will be cleared with
a simple \var{rtd} instruction, meaning that the size of all pushed
parameters is limited to 32K.

Otherwise, on MC68000/68010 processors, the stack clearing mechanism is
slightly more complicated; the exit code will look like this:
\begin{verbatim}
{
 move.l  (sp)+,a0
 add.l   paramsize,a0
 move.l  a0,-(sp)
 rts
}
\end{verbatim}

\subsubsection{ Amiga }
Under AmigaOS, stack size is determined by the user, who sets this value
using the stack program. Typical sizes range from 4K to 40K.

\subsubsection{ Atari }
Under Atari TOS, stack size is currently limited to 8K, and it cannot be
modified. This may change in a future release of the compiler.
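
To relate the calling sequence listed at the start of this section to
source code, consider the following sketch. The names \var{stackdemo},
\var{Describe} and \var{Nested} are invented for this illustration, and the
comments only repeat what the list above states; they are not the output of
a particular compiler version.
\begin{verbatim}
program stackdemo;

{ Hypothetical example: all names are made up for illustration. }

function Describe (A, B : longint) : string;

  procedure Nested;
  begin
    { Nested is a nested procedure: according to the list above,
      the frame pointer of Describe is pushed when it is called,
      which is what lets it reach A and B. }
    Writeln (A + B);
  end;

begin
  Nested;
  Describe := 'done';
end;

begin
  { According to the list above: B is pushed first, then A
    (right to left); next the address where the String result
    must be stored; finally the call instruction itself pushes
    the return address. }
  Writeln (Describe (1, 2));
end.
\end{verbatim}
Since \var{Describe} is neither an object method nor a nested routine, the
\var{self} and parent frame pointer entries of \seet{StackFrame} would not
be present for this call.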
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The heap
\section{The heap}
\label{se:Heap}
The heap is used to store all dynamic variables, and to store class
instances. The interface to the heap is the same as in Turbo Pascal,
although the effects are maybe not the same. On top of that, the \fpc
run-time library has some extra possibilities, not available in Turbo
Pascal. These extra possibilities are explained in the next subsections.

% The heap grows
\subsection{The heap grows}
\fpc supports the \var{HeapError} procedural variable. If this variable is
non-nil, then it is called in case you try to allocate memory, and the heap
is full. By default, \var{HeapError} points to the \var{GrowHeap} function,
which tries to increase the heap.

The growheap function issues a system call to try to increase the size of
the memory available to your program. It first tries to increase memory in a
1 Mb. chunk. If this fails, it tries to increase the heap by the amount you
requested from the heap.

If the call to \var{GrowHeap} has failed, then a run-time error is
generated, or nil is returned, depending on the \var{GrowHeap} result.

If the call to \var{GrowHeap} was successful, then the needed memory will be
allocated.

% Using Blocks
\subsection{Using Blocks}
If you need to allocate a lot of small blocks for a small period, then you
may want to recompile the run-time library with the \var{USEBLOCKS} symbol
defined. If it is recompiled, then the heap management is done in a
different way.

The run-time library keeps a linked list of allocated blocks with size up to
256 bytes\footnote{The size can be set using the \var{max\_size} constant in
the \file{heap.inc} source file.}. By default, it keeps 32 of these
lists\footnote{The actual size is \var{max\_size div 8}.}.

When a piece of memory in a block is deallocated, the heap manager doesn't
really deallocate the occupied memory. The block is simply put in the linked
list corresponding to its size.
When you request a block of memory again, the manager checks the list for
a non-allocated block which fits the size you need (rounded to 8 bytes).
If there is one, that block is used to satisfy your request.

This method of allocating works faster if the heap is very fragmented, and
you allocate a lot of small memory chunks.

Since it is invisible to the program, this provides an easy way of improving
the performance of the heap manager.

% The splitheap
\subsection{Using the split heap}
\begin{remark}
The split heap is still somewhat buggy. Use at your own risk
for the moment.
\end{remark}

The split heap can be used to quickly release a lot of blocks you allocated
previously.

Suppose that in a part of your program, you allocate a lot of memory chunks
on the heap. Suppose that you know that you'll release all this memory when
this particular part of your program is finished.

In Turbo Pascal, you could foresee this, and mark the position of the heap
(using the \var{Mark} function) when entering this particular part of your
program, and release the occupied memory in one call with the \var{Release}
call.

For most purposes, this works very well. But sometimes, you may need to
allocate something on the heap that you {\em don't} want deallocated when you
release the allocated memory. That is where the split heap comes in.

When you split the heap, the heap manager keeps 2 heaps: the base heap (the
normal heap), and the temporary heap. After the call to split the heap,
memory is allocated from the temporary heap. When you're finished using all
this memory, you unsplit the heap. This clears all the memory on the split
heap with one call. After that, memory will be allocated from the base heap
again.

So far, nothing special, nothing that can't be done with calls to \var{mark}
and \var{release}. Suppose now that you have split the heap, and that you've
come to a point where you need to allocate memory that is to stay allocated
after you unsplit the heap again.
At this point, \var{mark} and \var{release} are of no use. But when
using the split heap, you can tell the heap manager to use the base heap
again, temporarily, to allocate memory.
When you've allocated the needed memory, you can tell the heap manager that it
should start using the temporary heap again.
When you're finished using the temporary heap, you release it, and the
memory you allocated on the base heap will still be allocated.

To use the split heap, you must recompile the run-time library with the
\var{TempHeap} symbol defined.
This makes the following routines available:
\begin{verbatim}
  procedure Split_Heap;
  procedure Switch_To_Base_Heap;
  procedure Switch_To_Temp_Heap;
  procedure Switch_Heap;
  procedure ReleaseTempHeap;
  procedure GetTempMem(var p : pointer;size : longint);
\end{verbatim}
\var{Split\_Heap} is used to split the heap. It cannot be called two times
in a row, without a call to \var{ReleaseTempHeap}. \var{ReleaseTempHeap}
completely releases the memory used by the temporary heap.
Switching temporarily back to the base heap can be done using the
\var{Switch\_To\_Base\_Heap} call, and returning to the temporary heap is done
using the \var{Switch\_To\_Temp\_Heap} call. Switching from one to the other
without knowing which one you are on right now can be done using the
\var{Switch\_Heap} call, which will split the heap first if needed.

A call to \var{GetTempMem} will allocate a memory block on the temporary
heap, whatever the current heap is. The current heap after this call will be
the temporary heap.

Typically, what will appear in your code is the following sequence:
\begin{verbatim}
Split_Heap;
...
{ Memory allocation }
...
{ !! non-volatile memory needed !!}
Switch_To_Base_Heap;
getmem (P,size);
Switch_To_Temp_Heap;
...
{Memory allocation}
...
ReleaseTempHeap;
{All allocated memory is now freed, except for the memory pointed to by 'P' }
...
\end{verbatim}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Debugging the heap
\subsection{Debugging the heap}

\fpc provides a unit that allows you to trace allocation and deallocation
of heap memory: \file{heaptrc}.

If you specify the \var{-gh} switch on the command-line, or if you include
\var{heaptrc} as the first unit in your uses clause, the memory manager
will trace what is allocated and deallocated, and on exit of your program,
a summary will be sent to standard output.

More information on using the \var{heaptrc} mechanism can be found in the
\userref and \unitsref.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Writing your own memory manager.
\subsection{Writing your own memory manager}

\fpc allows you to write and use your own memory manager. The standard
functions \var{GetMem}, \var{FreeMem}, \var{ReallocMem} and \var{Maxavail}
use a special record in the system unit to do the actual memory management.
The system unit initializes this record with the system unit's own memory
manager, but you can read and set this record using the
\var{GetMemoryManager} and \var{SetMemoryManager} calls:
\begin{verbatim}
procedure GetMemoryManager(var MemMgr: TMemoryManager);
procedure SetMemoryManager(const MemMgr: TMemoryManager);
\end{verbatim}

The \var{TMemoryManager} record is defined as follows:
\begin{verbatim}
  TMemoryManager = record
    Getmem      : Function(Size:Longint):Pointer;
    Freemem     : Function(var p:pointer):Longint;
    FreememSize : Function(var p:pointer;Size:Longint):Longint;
    AllocMem    : Function(Size:longint):Pointer;
    ReAllocMem  : Function(var p:pointer;Size:longint):Pointer;
    MemSize     : function(p:pointer):Longint;
    MemAvail    : Function:Longint;
    MaxAvail    : Function:Longint;
    HeapSize    : Function:Longint;
  end;
\end{verbatim}

As you can see, the elements of this record are procedural variables.
The system unit does nothing but call these various variables when you
allocate or deallocate memory.
Each of these functions corresponds to the call of the same name in the
system unit. We'll describe each one of them:
\begin{description}
\item[Getmem] This function allocates a new block on the heap. The block
should be \var{Size} bytes long. The return value is a pointer to the newly
allocated block.
\item[Freemem] should release a previously allocated block. The pointer
\var{P} points to a previously allocated block. The memory manager should
implement a mechanism to determine the size of the memory
block\footnote{By storing its size at a negative offset, for instance.}.
The return value is optional, and can be used to return the size of the
freed memory.
\item[FreememSize] This function should release the memory pointed to by
\var{P}. The argument \var{Size} is the expected size of the memory block
pointed to by \var{P}. It can be disregarded, but can also be used to check
the behaviour of the program.
\item[AllocMem] is the same as \var{Getmem}, only the allocated memory should
be filled with zeroes before the call returns.
\item[ReAllocMem] should allocate a memory block \var{Size} bytes large, and
should fill it with the contents of the memory block pointed to by \var{P},
truncating it to the new size if needed. After that, the memory pointed to
by \var{P} may be deallocated. The return value is a pointer to the new
memory block.
\item[MemSize] should return the size of the memory block pointed to by
\var{P}. This function may return zero if the memory manager cannot
determine this information.
\item[MemAvail] should return the total amount of memory available for
allocation. This function may return zero if the memory manager cannot
determine this information.
\item[MaxAvail] should return the size of the largest block of memory that is
still available for allocation. This function may return zero if the memory
manager cannot determine this information.
\item[HeapSize] should return the total size of the heap. This may be zero
if the memory manager cannot determine this information.
\end{description}

To implement your own memory manager, it is sufficient to construct such a
record and to issue a call to \var{SetMemoryManager}.

To avoid conflicts with the system memory manager, setting the memory manager
should happen as soon as possible in the initialization of your program,
i.e. before any call to \var{getmem} is processed.

This means in practice that the unit implementing the memory manager should
be the first in the \var{uses} clause of your program or library, since it
will then be initialized before all other units (except for the system unit).

This also means that it is not possible to use the \file{heaptrc} unit in
combination with a custom memory manager, since the \file{heaptrc} unit uses
the system memory manager to do all its allocation. Putting the
\file{heaptrc} unit after the unit implementing the memory manager would
overwrite the memory manager record installed by the custom memory manager,
and vice versa.

The following unit shows a straightforward implementation of a custom
memory manager using the memory manager of the \var{C} library. It is
distributed as a package with \fpc.
\begin{verbatim}
unit cmem;

{$mode objfpc}

interface

{ Direct imports from the C library. }
Function Malloc (Size : Longint) : Pointer;cdecl; external 'c' name 'malloc';
Procedure Free (P : pointer); cdecl; external 'c' name 'free';
Procedure FreeMem (P : Pointer); cdecl; external 'c' name 'free';
function ReAlloc (P : Pointer; Size : longint) : pointer; cdecl;
  external 'c' name 'realloc';
Function CAlloc (unitSize,UnitCount : Longint) : pointer;cdecl;
  external 'c' name 'calloc';

implementation

Function CGetMem (Size : Longint) : Pointer;
begin
  result:=Malloc(Size);
end;

Function CFreeMem (Var P : pointer) : Longint;
begin
  Free(P);
  Result:=0;
end;

Function CFreeMemSize(var p:pointer;Size:Longint):Longint;
begin
  { The expected size is ignored. }
  Result:=CFreeMem(P);
end;

Function CAllocMem(Size : Longint) : Pointer;
begin
  Result:=calloc(Size,1);
end;

Function CReAllocMem (var p:pointer;Size:longint):Pointer;
begin
  Result:=realloc(p,size);
end;

{ The C library gives no size information, so these return zero. }
Function CMemSize (p:pointer): Longint;
begin
  Result:=0;
end;

Function CMemAvail : Longint;
begin
  Result:=0;
end;

Function CMaxAvail: Longint;
begin
  Result:=0;
end;

Function CHeapSize : Longint;
begin
  Result:=0;
end;

Const
 CMemoryManager : TMemoryManager =
    (
      GetMem : CGetmem;
      FreeMem : CFreeMem;
      FreememSize : CFreememSize;
      AllocMem : CAllocMem;
      ReallocMem : CReAllocMem;
      MemSize : CMemSize;
      MemAvail : CMemAvail;
      MaxAvail : CMaxAvail;
      HeapSize : CHeapSize;
    );

Var
  OldMemoryManager : TMemoryManager;

Initialization
  GetMemoryManager (OldMemoryManager);
  SetMemoryManager (CmemoryManager);

Finalization
  SetMemoryManager (OldMemoryManager);
end.
\end{verbatim}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Accessing DOS memory under the GO32 extender
\section{Using \dos memory under the Go32 extender}
\label{se:AccessingDosMemory}
Because \fpc is a 32 bit compiler, and uses a \dos extender, accessing \dos
memory isn't trivial.
What follows is an attempt to explain how to access and use \dos or real
mode memory\footnote{Thanks to an explanation of Thomas Schatzl
(E-mail:\var{tom\_at\_work@geocities.com}).}.

In {\em Protected Mode}, memory is accessed through {\em Selectors} and
{\em Offsets}. You can think of Selectors as the protected mode
equivalents of segments.

In \fpc, a pointer is an offset into the \var{DS} selector, which points to
the Data of your program.

To access the (real mode) \dos memory, you need a selector that points to
the \dos memory. The \file{GO32} unit provides you with such a selector:
the \var{DosMemSelector} variable, as it is conveniently called.

You can also allocate memory in \dos's memory space, using the
\var{global\_dos\_alloc} function of the \file{GO32} unit.
This function will allocate memory in a place where \dos sees it.

As an example, here is a procedure that allocates memory in real mode \dos
and returns both a selector and a real mode segment for it.
\begin{verbatim}
procedure dosalloc(var selector : word;
                   var segment : word;
                   size : longint);

var result : longint;

begin
  result := global_dos_alloc(size);
  { low word: selector, high word: real mode segment }
  selector := word(result);
  segment := word(result shr 16);
end;
\end{verbatim}
(You need to free this memory using the \var{global\_dos\_free} function.)

You can access any place in memory using a selector. You can get a selector
using the \var{allocate\_ldt\_descriptors} function, then let this selector
point to the physical memory you want using the
\var{set\_segment\_base\_address} function, and set its length using the
\var{set\_segment\_limit} function.
You can manipulate the memory pointed to by the selector using the functions
of the \file{GO32} unit, for instance the \var{seg\_fillchar} function.
After using the selector, you must free it again using the
\var{free\_ldt\_descriptor} function.

More information on all this can be found in the \unitsref, the chapter on
the \file{GO32} unit.
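
As a rough sketch of the selector steps described above, the following
program blanks the colour text screen through a freshly allocated selector.
It only makes sense for the GO32V2 target. The linear address of the text
buffer, the 80x25 screen size and the names \var{ScreenSize}, \var{TextBase}
and \var{sel} are assumptions made for this illustration; the exact
declarations of the \file{GO32} routines should be checked in the \unitsref.
\begin{verbatim}
program seldemo;

{ Sketch only: GO32V2 target, colour text mode assumed. }

uses go32;

const
  ScreenSize = 80 * 25 * 2;   { assumed: 80x25 cells, 2 bytes per cell }
  TextBase   = $B800 * 16;    { assumed: linear address of text buffer }

var
  sel : word;

begin
  { Get one descriptor and make it cover the text buffer. }
  sel := allocate_ldt_descriptors(1);
  set_segment_base_address(sel, TextBase);
  set_segment_limit(sel, ScreenSize - 1);
  { Offsets are now relative to the start of the text buffer. }
  seg_fillchar(sel, 0, ScreenSize, #0);
  { Give the descriptor back when done. }
  free_ldt_descriptor(sel);
end.
\end{verbatim}
The point of the sketch is the order of the calls: allocate, set base,
set limit, use, free. The return values of the \file{GO32} calls are ignored
here for brevity; real code would want to check them.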
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Resource strings
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Resource strings}
\label{resourcestrings}

\section{Introduction}
Resource strings primarily exist to make internationalization of applications
easier, by introducing a language construct that provides a uniform way of
handling constant strings.

Most applications communicate with the user through messages on the
graphical screen or console. Storing these messages in special constants
makes it possible to keep them in a uniform way in separate files, which can
be used for translation. A programmer's interface exists to manipulate the
actual values of the constant strings at runtime, and a utility tool comes
with the Free Pascal compiler to convert the resource string files to
whatever format is wanted by the programmer. Both these things are discussed
in the following sections.

\section{The resource string file}
When a unit is compiled that contains a \var{resourcestring} section, the
compiler does two things:
\begin{enumerate}
\item It generates a table that contains the values of the strings as they
are declared in the sources.
\item It generates a {\em resource string file} that contains the names of
all strings, together with their declared values.
\end{enumerate}
This approach has two advantages. First of all, the value of the string is
always present in the program. If the programmer doesn't care to translate
the strings, the default values are always present in the binary. This also
avoids having to provide a file containing the strings. Secondly, a compiler
generated file ensures that all strings are together (you can have multiple
resourcestring sections in one unit or program), and because this file has a
fixed format, the programmer is free to choose a method of
internationalization.
For each unit that is compiled and that contains a resourcestring section,
the compiler generates a file that has the name of the unit, and an extension
\file{.rst}. The format of this file is as follows:
\begin{enumerate}
\item An empty line.
\item A line starting with a hash sign (\var{\#}) and the hash value of the
string, preceded by the text \var{hash value =}.
\item A third line, containing the name of the resource string in the format
\var{unitname.constantname}, all lowercase, followed by an equal sign, and
the string value, in a format equal to the Pascal representation of this
string. The line may be continued on the next line; in that case it reads as
a Pascal string expression with a plus sign in it.
\item Another empty line.
\end{enumerate}
If the unit contains no \var{resourcestring} section, no file is generated.

For example, the following unit:
\begin{verbatim}
unit rsdemo;

{$mode delphi}
{$H+}

interface

resourcestring

  First = 'First';
  Second = 'A Second very long string that should cover more than 1 line';

implementation

end.
\end{verbatim}
will result in the following resource string file:
\begin{verbatim}

# hash value = 5048740
rsdemo.first='First'


# hash value = 171989989
rsdemo.second='A Second very long string that should cover more than 1 li'+
'ne'

\end{verbatim}
The hash value is calculated with the function \var{Hash}. It is present in
the \file{objpas} unit. The value is the same value that the GNU gettext
mechanism uses. It is in no way unique, and can only be used to speed up
searches.

The \file{rstconv} utility that comes with the \fpc compiler can be used to
manipulate these resource string files. At the moment, it can only be used
to make a \file{.po} file that can be fed to the GNU \file{msgfmt} program.
Anyone who wishes to have another format (Win32 resource files spring to
mind) can enhance the \file{rstconv} program so it can generate other types
of files as well.
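
To make the hash values in the resource string file shown above more
concrete, the following sketch recomputes one of them. It assumes that the
function is the classic ELF (PJW) string hash, as used by GNU gettext; the
name \var{ElfHash} is hypothetical, and the real \var{Hash} function in the
\file{objpas} unit may differ in details, such as the value it returns for
an empty string.
\begin{verbatim}
program hashdemo;

{$mode objfpc}{$H+}
{$Q-}{$R-}   { the hash relies on 32-bit wrap-around }

{ Hypothetical re-implementation of the ELF/PJW string hash. }
function ElfHash(const S : AnsiString) : Cardinal;
var
  g : Cardinal;
  i : Longint;
begin
  Result := 0;
  for i := 1 to Length(S) do
    begin
    { Shift in the next character. }
    Result := (Result shl 4) + Ord(S[i]);
    { Fold the top four bits back into the low bits and clear them. }
    g := Result and $F0000000;
    if g <> 0 then
      Result := Result xor (g shr 24) xor g;
    end;
end;

begin
  { 'First' is the default value of rsdemo.first in the example above. }
  Writeln(ElfHash('First'));
end.
\end{verbatim}
For the string \var{'First'} this prints 5048740, the value recorded for
\var{rsdemo.first} in the \file{.rst} file. Since different strings can
produce the same value, a lookup still has to compare the strings themselves
after a hash match.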
GNU gettext was chosen because it is available on all platforms, and is
already widely used in the \var{Unix} and free software community. Since the
\fpc team doesn't want to restrict the use of resource strings, the
\file{.rst} format was chosen to provide a neutral method, not restricted to
any tool.

If you use resource strings in your units, and you want people to be able to
translate the strings, you must provide the resource string file. Currently,
there is no way to extract the strings from the unit file, though this is in
principle possible. Providing the file is not required, since the program can
be compiled without it, but then the strings cannot be translated.

\section{Updating the string tables}
Having compiled a program with resourcestrings is not enough to
internationalize your program. At run-time, the program must initialize
the string tables with the correct values for the language that the user
selected. By default no such initialization is performed. All strings
are initialized with their declared values.

The \file{objpas} unit provides the mechanism to correctly initialize
the string tables. There is no need to include this unit in a \var{uses}
clause, since it is automatically loaded when a program or unit is
compiled in \var{Delphi} or \var{objfpc} mode. Since one of these modes is
required to use resource strings, the unit is always loaded when needed.

The resource strings are stored in tables, one per unit, and one for the
program, if it contains a \var{resourcestring} section as well. Each
resourcestring is stored with its name, hash value, default value, and
the current value, all as \var{AnsiStrings}.

The \file{objpas} unit offers methods to retrieve the number of
resourcestring tables, the number of strings per table, and the above
information for each string. It also offers a method to set the current
value of the strings.
Here are the declarations of all the functions:
\begin{verbatim}
Function ResourceStringTableCount : Longint;
Function ResourceStringCount(TableIndex : longint) : longint;
Function GetResourceStringName(TableIndex,
                               StringIndex : Longint) : Ansistring;
Function GetResourceStringHash(TableIndex,
                               StringIndex : Longint) : Longint;
Function GetResourceStringDefaultValue(TableIndex,
                               StringIndex : Longint) : AnsiString;
Function GetResourceStringCurrentValue(TableIndex,
                               StringIndex : Longint) : AnsiString;
Function SetResourceStringValue(TableIndex,
                                StringIndex : longint;
                                Value : Ansistring) : Boolean;
Procedure SetResourceStrings (SetFunction : TResourceIterator);
\end{verbatim}
Two other functions exist, for convenience only:
\begin{verbatim}
Function Hash(S : AnsiString) : longint;
Procedure ResetResourceTables;
\end{verbatim}

Here is a short explanation of what each function does. A more detailed
explanation of the functions can be found in the \refref.

\begin{description}
\item[ResourceStringTableCount] returns the number of resource string tables
in the program.
\item[ResourceStringCount] returns the number of resource string entries in
a given table (tables are denoted by a zero-based index).
\item[GetResourceStringName] returns the name of a resource string in a
resource table. This is the name of the unit, a dot (.) and the name of
the string constant, all in lowercase. The strings are denoted by index,
also zero-based.
\item[GetResourceStringHash] returns the hash value of a resource string,
as calculated by the compiler with the \var{Hash} function.
\item[GetResourceStringDefaultValue] returns the default value of a resource
string, i.e. the value that appears in the resource string declaration,
and that is stored in the binary.
\item[GetResourceStringCurrentValue] returns the current value of a resource
string, i.e. the value set by the initialization (the default value), or
the value set by some previous internationalization routine.
\item[SetResourceStringValue] sets the current value of a resource string.
This function must be called to initialize all strings.
\item[SetResourceStrings] takes a callback, calls it for each resource
string in turn, and sets the value of the string to the return value of
the callback.
\end{description}

Two other functions exist, for convenience only:
\begin{description}
\item[Hash] can be used to calculate the hash value of a string. The hash
value stored in the tables is the result of this function, applied to the
default value. That value is calculated at compile time by the compiler.
\item[ResetResourceTables] will reset all the resource strings to their
default values. It is called by the initialization code of the \file{objpas}
unit.
\end{description}

Given some \var{Translate} function, the following code would initialize
all resource strings:
\begin{verbatim}
Var I,J : Longint;
    S : AnsiString;

begin
  For I:=0 to ResourceStringTableCount-1 do
    For J:=0 to ResourceStringCount(i)-1 do
      begin
      S:=Translate(GetResourceStringDefaultValue(I,J));
      SetResourceStringValue(I,J,S);
      end;
end;
\end{verbatim}
Other methods are of course possible, and the \var{Translate} function
can be implemented in a variety of ways.

\section{GNU gettext}
The unit \file{gettext} provides a way to internationalize an application
with the GNU \file{gettext} utilities. This unit is supplied with the
Free Component Library (FCL). For a given application, the following steps
must be followed:
\begin{enumerate}
\item Collect all resource string files and concatenate them together.
\item Invoke the \file{rstconv} program on the file resulting from
step 1, producing a single \file{.po} file containing all resource strings
of the program.
\item Translate the \file{.po} file of step 2 into all required languages.
\item Run the \file{msgfmt} formatting program on all the \file{.po} files,
resulting in a set of \file{.mo} files, which can be distributed with your
application.
\item Call the \file{gettext} unit's \var{TranslateResourceStrings} method,
giving it a template for the location of the \file{.mo} files, e.g. as in
\begin{verbatim}
TranslateResourcestrings('intl/restest.%s.mo');
\end{verbatim}
The \var{\%s} specifier will be replaced by the contents of the \var{LANG}
environment variable. This call should happen at program startup.
\end{enumerate}
An example program exists in the FCL sources, in the \file{fcl/tests}
directory.

\section{Caveat}
In principle it is possible to translate all resource strings at any
time in a running program. However, the change is not propagated to
strings that were assigned earlier; it is only noticed the next time the
resource string itself is used.

Consider the following example:
\begin{verbatim}
ResourceString

  help = 'With a little help of a programmer.';

Var
  A : AnsiString;

begin

  { lots of code }

  A:=Help;

  { Again some code}

  TranslateStrings;

  { More code }
\end{verbatim}
After the call to \var{TranslateStrings}, the value of \var{A} will remain
unchanged. This means that the assignment \var{A:=Help} must be executed
again in order for the change to become visible. This is important,
especially for GUI programs which have e.g. a menu. In order for the
change in resource strings to become visible, the new values must be
reloaded by program code into the menus.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Optimizations done in the compiler
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\chapter{Optimizations}
\section{ Non processor specific }
The following sections describe the general optimizations
done by the compiler; they are not processor specific. Some
of these require a compiler switch, while others are done
automatically (those which require a switch will be noted
as such).
\subsection{ Constant folding }
In \fpc, if the operand(s) of an operator are constants, they
will be evaluated at compile time.

For example:
\begin{verbatim}
  x:=1+2+3+6+5;
will generate the same code as
  x:=17;
\end{verbatim}
Furthermore, if an array index is a constant, the offset will
be evaluated at compile time. This means that accessing
\var{MyData[5]} is as efficient as accessing a normal variable.

Finally, calling the \var{Chr}, \var{Hi}, \var{Lo}, \var{Ord}, \var{Pred},
or \var{Succ} functions with constant parameters generates no
run-time library calls; instead, the values are evaluated at
compile time.

\subsection{ Constant merging }
Using the same constant string two or more times generates only
one copy of the string constant.

\subsection{ Short cut evaluation }
Evaluation of a boolean expression stops as soon as the result is
known, which makes code execute faster than if all boolean operands
were evaluated.

\subsection{ Constant set inlining }
Using the \var{in} operator is always more efficient than using the
equivalent \verb|<>|, \verb|=|, \verb|<=|, \verb|>=|, \verb|<| and \verb|>|
operators. This is because range comparisons can be done more easily with
\var{in} than with normal comparison operators.

\subsection{ Small sets }
Sets which contain fewer than 33 elements can be directly encoded
using a 32-bit value, therefore no run-time library calls are required
to evaluate operands on these sets; they are directly encoded by the
code generator.

\subsection{ Range checking }
Assignments of constants to variables are range checked at compile
time, which removes the need to generate runtime range checking
code for them.
\begin{remark}
This feature was not implemented before version 0.99.5 of \fpc.
\end{remark}

\subsection{ Shifts instead of multiply or divide }
When one of the operands in a multiplication is a power of
two, the multiplication is encoded using arithmetic shift instructions,
which generates more efficient code.
Similarly, if the divisor in a \var{div} operation is a power
of two, it is encoded using arithmetic shift instructions.

The same is true when indexing arrays whose element size is a
power of two: the address is calculated using arithmetic
shifts instead of the multiply instruction.

\subsection{ Automatic alignment }
By default all variables larger than a byte are guaranteed to be aligned
at least on a word boundary.

Furthermore all pointers allocated using the standard runtime library
(\var{New} and \var{GetMem} among others) are guaranteed to be aligned
on a quadword boundary (64-bit alignment).

Alignment of variables on the stack depends on the target processor.

\begin{remark}
Two facts about alignment:
\begin{enumerate}
\item Quadword alignment of pointers is not guaranteed on systems which
don't use an internal heap, such as for the Win32 target.
\item Alignment is also done \emph{between} fields in records, objects and
classes. This is \emph{not} the same as in Turbo Pascal and may cause
problems when using disk I/O with these types. To get no alignment between
fields use the \var{packed} directive or the \var{\{\$PackRecords n\}}
switch. For further information, take a look at the reference manual under
the \var{record} heading.
\end{enumerate}
\end{remark}

\subsection{Smart linking}
This feature removes all unreferenced code in the final executable file,
making the executable file much smaller.

Smart linking is switched on with the \var{-Cx} command-line switch, or
using the \var{\{\$SMARTLINK ON\}} global directive.
\begin{remark}
Smart linking was implemented starting with version 0.99.6 of \fpc.
\end{remark}

\subsection{ Inline routines }
The following runtime library routines are coded directly into the
final executable: \var{Lo}, \var{Hi}, \var{High}, \var{Sizeof},
\var{TypeOf}, \var{Length}, \var{Pred}, \var{Succ}, \var{Inc},
\var{Dec} and \var{Assigned}.
\begin{remark}
Inline \var{Inc} and \var{Dec} were not completely implemented until
version 0.99.6 of \fpc.
\end{remark}

\subsection{ Case optimization }
When using the \var{-O1} (or higher) switch, case statements will be
generated using a jump table if appropriate, to make them execute
faster.

\subsection{ Stack frame omission }
Under specific conditions, the stack frame (entry and exit code for the
routine, see section \ref{se:Calling}) will be omitted, and variables
will be accessed directly via the stack pointer.

Conditions for omission of the stack frame:
\begin{itemize}
\item The function has no parameters nor local variables.
\item Routine does not call other routines.
\item Routine does not contain assembler statements. However,
an \var{assembler} routine may omit its stack frame.
\item Routine is not declared using the \var{Interrupt} directive.
\item Routine is not a constructor or destructor.
\end{itemize}

\subsection{ Register variables }
When using the \var{-Or} switch, local variables or parameters
which are used very often will be moved to registers for faster
access.
\begin{remark}
Register variable allocation is currently an experimental feature,
and should be used with caution.
\end{remark}

\subsection{ Intel x86 specific }
Here follows a listing of the optimizing techniques used in the compiler:
\begin{enumerate}
\item When optimizing for a specific processor (\var{-Op1, -Op2, -Op3}),
the following is done:
\begin{itemize}
\item In \var{case} statements, a check is done whether a jump table
or a sequence of conditional jumps should be used for optimal performance.
\item A number of strategies are selected for peephole optimization, e.g.:
\var{movzbl (\%ebp), \%eax} will be changed into
\var{xorl \%eax,\%eax; movb (\%ebp),\%al } for Pentium and PentiumMMX.
\end{itemize}
\item When optimizing for speed (\var{-OG}, the default) or size
(\var{-Og}), a choice is made between using shorter instructions (for size)
such as \var{enter \$4}, or longer instructions such as
\var{subl \$4,\%esp} for speed. When smaller size is requested, things
aren't aligned on 4-byte boundaries. When speed is requested, things are
aligned on 4-byte boundaries as much as possible.
\item Fast optimizations (\var{-O1}): activate the peephole optimizer.
\item Slower optimizations (\var{-O2}): also activate the common
subexpression elimination (formerly called the "reloading optimizer").
\item Uncertain optimizations (\var{-Ou}): With this switch, the common
subexpression elimination algorithm can be forced into making uncertain
optimizations.

Although you can enable uncertain optimizations in most cases, for people
who do not understand the following technical explanation, it might be the
safest to leave them off.

\begin{quote}
% Jonas's own words..
\em
If uncertain optimizations are enabled, the CSE algorithm assumes
that
\begin{itemize}
\item If something is written to a local/global register or a
procedure/function parameter, this value doesn't overwrite the value to
which a pointer points.
\item If something is written to memory pointed to by a pointer variable,
this value doesn't overwrite the value of a local/global variable or a
procedure/function parameter.
\end{itemize}
% end of quote
\end{quote}
The practical upshot of this is that you cannot use the uncertain
optimizations if you both write and read local or global variables directly
and through pointers (this includes \var{Var} parameters, as those are
pointers too).

The following example will produce bad code when you switch on
uncertain optimizations:
\begin{verbatim}
Var temp: Longint;

Procedure Foo(Var Bar: Longint);
Begin
  If (Bar = temp)
    Then
      Begin
        Inc(Bar);
        If (Bar <> temp) then Writeln('bug!')
      End
End;

Begin
  Foo(Temp);
End.
\end{verbatim}
The reason it produces bad code is that you access the global variable
\var{Temp} both through its name \var{Temp} and through a pointer, in this
case using the \var{Bar} variable parameter, which is nothing but a pointer
to \var{Temp} in the above code.

On the other hand, you can use the uncertain optimizations if
you access global/local variables or parameters through pointers,
and {\em only} access them through this pointer\footnote{
You can use multiple pointers to point to the same variable as well, that
doesn't matter.}.

For example:
\begin{verbatim}
Type TMyRec = Record
                a, b: Longint;
              End;
     PMyRec = ^TMyRec;

     TMyRecArray = Array [1..100000] of TMyRec;
     PMyRecArray = ^TMyRecArray;

Var MyRecArrayPtr: PMyRecArray;
    MyRecPtr: PMyRec;
    Counter: Longint;

Begin
  New(MyRecArrayPtr);
  For Counter := 1 to 100000 Do
    Begin
       MyRecPtr := @MyRecArrayPtr^[Counter];
       MyRecPtr^.a := Counter;
       MyRecPtr^.b := Counter div 2;
    End;
End.
\end{verbatim}
will produce correct code, because the global variable \var{MyRecArrayPtr}
is not accessed directly, but only through a pointer (\var{MyRecPtr} in this
case).

In conclusion, one could say that you can use uncertain optimizations
{\em only} when you know what you're doing.
\end{enumerate}

\subsection{ Motorola 680x0 specific }
Using the \var{-O2} switch enables several optimizations in the
code produced, the most notable being:
\begin{itemize}
\item Sign extension from byte to long will use \var{EXTB}.
\item Returning from functions will use \var{RTD}.
\item Range checking will generate no run-time calls.
\item Multiplication will use the long \var{MULS} instruction, no
runtime library call will be generated.
\item Division will use the long \var{DIVS} instruction, no
runtime library call will be generated.
\end{itemize}

\section{Optimization switches}
This is where the various optimizing switches and their actions are
described, grouped per switch.
\begin{description}
\item [-On:\ ] with n = 1..3: these switches activate the optimizer.
A higher level automatically includes all lower levels.
\begin{itemize}
\item Level 1 (\var{-O1}) activates the peephole optimizer (common instruction sequences are replaced by faster equivalents).
\item Level 2 (\var{-O2}) enables the assembler data flow analyzer, which allows the common subexpression elimination procedure to remove unnecessary reloads of registers with values they already contain.
\item Level 3 (\var{-O3}) enables uncertain optimizations. For more info, see -Ou.
\end{itemize}
\item[-OG:\ ] This causes the code generator (and optimizer, if activated), to favor faster, but code-wise larger, instruction sequences (such as "\verb|subl $4,%esp|") instead of slower, smaller instructions ("\verb|enter $4|"). This is the default setting.
\item[-Og:\ ] This one is exactly the reverse of -OG, and as such these switches are mutually exclusive: enabling one will disable the other.
\item[-Or:\ ] This setting (once it's fixed) causes the code generator to check which variables are used most, so it can keep those in a register.
\item[-Opn:\ ] with n = 1..3: Setting the target processor does NOT activate the optimizer. It merely influences the code generator and, if activated, the optimizer:
\begin{itemize}
\item During the code generation process, this setting is used to decide whether a jump table or a sequence of successive jumps provides the best performance in a case statement.
\item The peephole optimizer takes a number of decisions based on this setting, for example it translates certain complex instructions, such as
\begin{verbatim}
movzbl (mem), %eax
\end{verbatim}
to a combination of simpler instructions
\begin{verbatim}
xorl %eax, %eax
movb (mem), %al
\end{verbatim}
for the Pentium.
\end{itemize}
\item[-Ou:\ ] This enables uncertain optimizations. You cannot use these always, however. The previous section explains when they can be used, and when they cannot be used.
\end{description}

\section{Tips to get faster code}
Here, some general tips for getting better code are presented. They mainly concern coding style.
\begin{itemize}
\item Find a better algorithm. No matter how much you and the compiler tweak the code, a quicksort will (almost) always outperform a bubble sort, for example.
\item Use variables of the native size of the processor you're writing for. For the 80x86 and compatibles, this is 32 bit, so you're best off using longint and cardinal variables.
\item Turn on the optimizer.
\item Write your if/then/else statements so that the code in the "then"-part gets executed most of the time (improves the rate of successful jump prediction).
\item If you are allocating and disposing a lot of small memory blocks, check out the heapblocks variable (heapblocks are on by default from release 0.99.8 and later).
\item Profile your code (see the -pg switch) to find out where the bottlenecks are. If you want, you can rewrite those parts in assembler. You can take the code generated by the compiler as a starting point. When given the \var{-a} command-line switch, the compiler will not erase the assembler file at the end of the assembly process, so you can study the assembler file.

{\em Note:} Code blocks which contain an assembler block, are not processed at all by the optimizer at this time. Update: as of version 0.99.11, the Pascal code surrounding the assembler blocks is optimized.
\end{itemize}

\section{ Floating point }
This section contains processor-specific information on the floating point code generated by the compiler.

\subsection{ Intel x86 specific }
All normal floating point types map to their real type, including \var{comp} and \var{extended}.
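
As a small illustration of this mapping, the following program prints the storage size of each floating point type. It is only a sketch: the program name \file{fpsizes} is made up for this example, and the sizes that are printed depend on the target processor and compiler version, so no output is listed here.
\begin{verbatim}
program fpsizes;

{ Prints the storage size, in bytes, of each floating point type. }
begin
  Writeln('Single   : ', SizeOf(Single));
  Writeln('Double   : ', SizeOf(Double));
  Writeln('Extended : ', SizeOf(Extended));
  Writeln('Comp     : ', SizeOf(Comp));
end.
\end{verbatim}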

\subsection{ Motorola 680x0 specific }
Early generations of the Motorola 680x0 processors did not have integrated floating point units, so to circumvent this fact, all floating point operations are emulated (with the \var{\$E+} switch, which is the default) using the IEEE \var{Single} floating point type. In other words when emulation is on, Real, Single, Double and Extended all map to the \var{single} floating point type.

When the \var{\$E} switch is turned off, normal 68882/68881/68040 floating point opcodes are emitted. The Real type still maps to \var{Single} but the other types map to their true floating point types. Only basic FPU opcodes are used, which means that it can work on 68040 processors correctly.
\begin{remark}
\var{Double} and \var{Extended} types in true floating point mode have not been extensively tested as of version 0.99.5.
\end{remark}
\begin{remark}
The \var{comp} data type is currently not supported.
\end{remark}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% using resources
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Using Windows resources}
\label{ch:windres}

\section{The resource directive \var{\$R}}
Under \windows, you can include resources in your executable or library using the \var{\{\$R filename\}} directive. These resources can then be accessed through the standard windows API calls.

When the compiler encounters a resource directive, it just creates an entry in the unit \file{.ppu} file; it doesn't link the resource. Only when it creates a library or executable does it look for all the resource files for which it encountered a directive, and try to link them in.

The default extension for resource files is \file{.res}. When the filename has an asterisk (\var{*}) as its first character, the compiler will replace the asterisk with the name of the current unit, library or program.
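
A minimal sketch of how the directive might be placed in a program is given here. The names \file{myprog} and \file{icons.res} are hypothetical and only serve as an illustration; with the starred form, the compiler would look for \file{myprog.res}.
\begin{verbatim}
program myprog;

{ The star is replaced with the program name: this refers to myprog.res }
{$R *.res}

{ An explicitly named resource file }
{$R icons.res}

begin
end.
\end{verbatim}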
\begin{remark}
This means that the asterisk may only be used after a \var{unit}, \var{library} or \var{program} clause.
\end{remark}

\section{Creating resources}
The \fpc compiler itself doesn't create any resource files; it just compiles them into the executable. To create resource files, you can use a GUI tool such as the Borland Resource Workshop; but it is also possible to use a windows resource compiler like \gnu \file{windres}. \file{windres} comes with the \gnu binutils, but the \fpc distribution also contains a version which you can use.

The usage of windres is straightforward; it reads an input file describing the resources to create and outputs a resource file. A typical invocation of \file{windres} would be
\begin{verbatim}
windres -i mystrings.rc -o mystrings.res
\end{verbatim}
This will read the \file{mystrings.rc} file and output a \file{mystrings.res} resource file.

A complete overview of the windres tools is outside the scope of this document, but here are some things you can use it for:
\begin{description}
\item[stringtables] that contain lists of strings.
\item[bitmaps] which are read from an external file.
\item[icons] which are also read from an external file.
\item[Version information] which can be viewed with the Windows explorer.
\item[Menus] Can be designed as resources and used in your GUI applications.
\item[Arbitrary data] Can be included as resources and read with the windows API calls.
\end{description}
Some of these will be described below.

\section{Using string tables.}
String tables can be used to store and retrieve large collections of strings in your application. A string table looks as follows:
\begin{verbatim}
STRINGTABLE
{
1, "hello World !"
2, "hello world again !"
3, "last hello world !"
}
\end{verbatim}
You can compile this (we assume the file is called \file{tests.rc}) as follows:
\begin{verbatim}
windres -i tests.rc -o tests.res
\end{verbatim}
And this is the way to retrieve the strings from your program:
\begin{verbatim}
program tests;

{$mode objfpc}

Uses Windows;

{$R *.res}

  Function LoadResourceString (Index : longint): Shortstring;

  begin
    SetLength(Result,LoadString(FindResource(0,Nil,RT_STRING),Index,@Result[1],SizeOf(Result)))
  end;

Var
    I: longint;

begin
  For i:=1 to 3 do
    Writeln (Loadresourcestring(I));
end.
\end{verbatim}
The call to \var{FindResource} searches for the stringtable in the compiled-in resources. The \var{LoadString} function then reads the string with index \var{i} out of the table, and puts it in a buffer, which can then be used. Both calls are in the windows unit.

\section{Inserting version information}
The win32 API allows you to store version information in your binaries. This information can be made visible with the \windows Explorer, by right-clicking on the executable or library, and selecting the 'Properties' menu. In the tab 'Version' the version information will be displayed.

Here is how to insert version information in your binary:
\begin{verbatim}
1 VERSIONINFO
FILEVERSION 4, 0, 3, 17
PRODUCTVERSION 3, 0, 0, 0
FILEFLAGSMASK 0
FILEOS 0x40000
FILETYPE 1
{
 BLOCK "StringFileInfo"
 {
  BLOCK "040904E4"
  {
   VALUE "CompanyName", "Free Pascal"
   VALUE "FileDescription", "Free Pascal version information extractor"
   VALUE "FileVersion", "1.0"
   VALUE "InternalName", "Showver"
   VALUE "LegalCopyright", "GNU Public License"
   VALUE "OriginalFilename", "showver.pp"
   VALUE "ProductName", "Free Pascal"
   VALUE "ProductVersion", "1.0"
  }
 }
}
\end{verbatim}
As you can see, you can insert various kinds of information in the version info block. The keyword \var{VERSIONINFO} marks the beginning of the version information resource block.
The keywords \var{FILEVERSION} and \var{PRODUCTVERSION} give the actual file and product versions, while the block \var{StringFileInfo} gives other information that is displayed in the explorer.

The Free Component Library comes with a unit (\file{fileinfo}) that allows you to extract and view version information in a straightforward and easy manner; the demo program that comes with it (\file{showver}) shows version information for an arbitrary executable or DLL.

\section{Inserting an application icon}
When \windows shows an executable in the Explorer, it looks for an icon in the executable to show in front of the filename, the application icon. Inserting an application icon is very easy and can be done as follows:
\begin{verbatim}
AppIcon ICON "filename.ico"
\end{verbatim}
This will read the file \file{filename.ico} and insert it in the resource file.

\section{Using a pascal preprocessor}
Sometimes you want to use symbolic names in your resource file, and use the same names in your program to access the resources. To accomplish this, there exists a preprocessor for \file{windres} that understands pascal syntax: \file{fprcp}. This preprocessor is shipped with the \fpc distribution.

The idea is that the preprocessor reads a pascal unit that has some symbolic constants defined in it, and replaces symbolic names in the resource file by the values of the constants in the unit.

As an example, consider the following unit:
\begin{verbatim}
unit myunit;

interface

Const
  First  = 1;
  Second = 2;
  Third  = 3;

Implementation
end.
\end{verbatim}
And the following resource file:
\begin{verbatim}
#include "myunit.pp"

STRINGTABLE
{
 First, "hello World !"
 Second, "hello world again !"
 Third, "last hello world !"
}
\end{verbatim}
If you invoke windres with the \var{--preprocessor} option:
\begin{verbatim}
windres --preprocessor fprcp -i myunit.rc -o myunit.res
\end{verbatim}
Then the preprocessor will replace the symbolic names 'first', 'second' and 'third' with their actual values.
In your program, you can then refer to the strings by their symbolic names (the constants) instead of using a numeric index.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Appendices
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\appendix

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Appendix A
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Anatomy of a unit file}
\label{ch:AppA}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Basics
\section{Basics}
The best and most up-to-date documentation about the ppu files can be found in \file{ppu.pas} and \file{ppudump.pp} which can be found in \file{rtl/utils/}.

To read or write the ppufile, you can use the ppu unit \file{ppu.pas} which has an object called tppufile which holds all routines that deal with ppufile handling. While describing the layout of a ppufile, the methods which can be used for it are presented as well.

A unit file consists of basically five or six parts:
\begin{enumerate}
\item A unit header.
\item A file interface part.
\item A definition part. Contains all type and procedure definitions.
\item A symbol part. Contains all symbol names and references to their definitions.
\item A browser part. Contains all references from this unit to other units and inside this unit. Only available when the \var{uf\_has\_browser} flag is set in the unit flags.
\item A file implementation part (currently unused).
\end{enumerate}

\section{Reading ppufiles}
We will first create an object ppufile which will be used below. We are opening unit \file{test.ppu} as an example.
\begin{verbatim}
var
  ppufile : pppufile;
begin
{ Initialize object }
  ppufile:=new(pppufile,init('test.ppu'));
{ open the unit and read the header, returns false when it fails }
  if not ppufile.open then
    error('error opening unit test.ppu');

{ here we can read the unit }

{ close unit }
  ppufile.close;
{ release object }
  dispose(ppufile,done);
end;
\end{verbatim}
Note: When a function fails (for example not enough bytes left in an entry) it sets the \var{ppufile.error} variable.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The Header
\section{The Header}
The header consists of a record containing 24 bytes:
\begin{verbatim}
tppuheader=packed record
  id       : array[1..3] of char; { = 'PPU' }
  ver      : array[1..3] of char;
  compiler : word;
  cpu      : word;
  target   : word;
  flags    : longint;
  size     : longint; { size of the ppufile without header }
  checksum : longint; { checksum for this ppufile }
end;
\end{verbatim}
The header is already read by the \var{ppufile.open} command. You can access all fields using \var{ppufile.header} which holds the current header record.

\begin{tabular}{lp{10cm}}
\raggedright
field & description \\ \hline
\var{id} & this is always 'PPU', can be checked with \mbox{\var{function ppufile.CheckPPUId:boolean;}} \\
\var{ver} & ppu version, currently '015', can be checked with \mbox{\var{function ppufile.GetPPUVersion:longint;}} (returns 15) \\
\var{compiler} & compiler version used to create the unit. Doesn't contain the patchlevel. Currently 0.99 where 0 is the high byte and 99 the low byte \\
\var{cpu} & cpu for which this unit is created. 0 = i386 1 = m68k \\
\var{target} & target for which this unit is created, this depends also on the cpu!
For i386:
\begin{tabular}[t]{ll}
0 & Go32v1 \\
1 & Go32V2 \\
2 & Linux-i386 \\
3 & OS/2 \\
4 & Win32
\end{tabular}
For m68k:
\begin{tabular}[t]{ll}
0 & Amiga \\
1 & Mac68k \\
2 & Atari \\
3 & Linux-m68k
\end{tabular} \\
\var{flags} & the unit flags, contains a combination of the uf\_ constants which are defined in \file{ppu.pas} \\
\var{size} & size of this unit without this header \\
\var{checksum} & checksum of the interface parts of this unit, which determine if a unit is changed or not, so other units can see if they need to be recompiled \\ \hline
\end{tabular}

% The sections
\section{The sections}
After this header follow the sections. All sections work the same! A section consists of entries and also ends with an entry, one that contains the specific \var{ibend} constant (see \file{ppu.pas} for a list of constants).

Each entry starts with an entryheader.
\begin{verbatim}
tppuentry=packed record
  id   : byte;
  nr   : byte;
  size : longint;
end;
\end{verbatim}

\begin{tabular}{lp{10cm}}
field & Description \\ \hline
id & this is 1 or 2 and can be checked to see whether the entry is correctly found. 1 means it is a main entry, which says that it is part of the basic layout as explained before. 2 means that it is a sub entry of a record or object. \\
nr & contains the ib constant number which determines what kind of entry it is. \\
size & size of this entry without the header, can be used to skip entries very easily. \\ \hline
\end{tabular}

To read an entry you can simply call \var{ppufile.readentry:byte}, it returns the \var{tppuentry.nr} field, which holds the type of the entry. A common way to do this is (example is for the symbols):
\begin{verbatim}
repeat
  b:=ppufile.readentry;
  case b of
    ib : begin
         end;
    ibendsyms : break;
  end;
until false;
\end{verbatim}
Then you can parse each entry type yourself. \var{ppufile.readentry} will take care of skipping unread bytes in the entry and reads the next entry correctly!
A special function is \var{skipuntilentry(untilb:byte):boolean;} which will read the ppufile until it finds entry \var{untilb} in the main entries.

Parsing an entry can be done with \var{ppufile.getxxx} functions. The available functions are:
\begin{verbatim}
procedure ppufile.getdata(var b;len:longint);
function getbyte:byte;
function getword:word;
function getlongint:longint;
function getreal:ppureal;
function getstring:string;
\end{verbatim}
To check if you're at the end of an entry you can use the following function:
\begin{verbatim}
function EndOfEntry:boolean;
\end{verbatim}
{\em notes:}
\begin{enumerate}
\item \var{ppureal} is the most precise real type available on the cpu for which the unit is created. Currently it is \var{extended} for i386 and \var{single} for m68k.
\item the \var{ibobjectdef} and \var{ibrecorddef} have stored a definition and symbol section for themselves. So you'll need a recursive call. See \file{ppudump.pp} for a correct implementation.
\end{enumerate}
A complete list of entries and what their fields contain can be found in \file{ppudump.pp}.

\section{Creating ppufiles}
Creating a new ppufile works almost the same as reading one. First you need to init the object and call create:
\begin{verbatim}
ppufile:=new(pppufile,init('output.ppu'));
ppufile.create;
\end{verbatim}
After that you can simply write all needed entries. You'll have to take care that you write at least the basic entries for the sections:
\begin{verbatim}
ibendinterface
ibenddefs
ibendsyms
ibendbrowser (only when you've set uf_has_browser!)
ibendimplementation
ibend
\end{verbatim}
Writing an entry is a little different than reading it.
You first need to put everything in the entry with ppufile.putxxx:
\begin{verbatim}
procedure putdata(var b;len:longint);
procedure putbyte(b:byte);
procedure putword(w:word);
procedure putlongint(l:longint);
procedure putreal(d:ppureal);
procedure putstring(s:string);
\end{verbatim}
After putting all the things in the entry you need to call \var{ppufile.writeentry(ibnr:byte)} where \var{ibnr} is the entry number you're writing.

At the end of the file you need to call \var{ppufile.writeheader} to write the new header to the file. This automatically takes care of the new size of the ppufile. When that is also done you can call \var{ppufile.close} and dispose of the object.

Extra functions/variables available for writing are:
\begin{verbatim}
ppufile.NewHeader;
ppufile.NewEntry;
\end{verbatim}
This will give you a clean header or entry. Normally this is called automatically in \var{ppufile.writeentry}, so there should be no need to call these methods.
\begin{verbatim}
ppufile.flush;
\end{verbatim}
to flush the current buffers to the disk
\begin{verbatim}
ppufile.do_crc:boolean;
\end{verbatim}
set to false if you don't want the crc to be updated, this is necessary if you write for example the browser data.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Appendix B
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Compiler and RTL source tree structure}
\label{ch:AppB}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The compiler source tree
\section{The compiler source tree}
All compiler source files are in one directory, normally in \file{source/compiler}. For more information about the structure of the compiler, have a look at the Compiler Manual, which also contains some information about compiler internals.

The \file{compiler} directory contains a subdirectory \var{utils}, which contains mainly the utilities for creation and maintenance of the message files.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The RTL source tree
\section{The RTL source tree}
The RTL source tree is divided in many subdirectories, but is very structured and easy to understand. It mainly consists of three parts:
\begin{enumerate}
\item An OS-dependent directory. This contains the files that are different for each operating system. When compiling the RTL, you should do it here. The following directories exist:
\begin{itemize}
\item \file{atari} for the atari. Not maintained any more.
\item \file{amiga} for the amiga. Not maintained any more.
\item \file{go32v1} For \dos, using the GO32v1 extender. Not maintained any more.
\item \file{go32v2} For \dos, using the GO32v2 extender.
\item \file{linux} for \linux platforms. It has two subdirect
\item \file{os2} for \ostwo.
\item \file{win32} for Win32 platforms.
\end{itemize}
\item A processor dependent directory. This contains files that are system independent, but processor dependent. It contains mostly optimized routines for a specific processor. The following directories exist:
\begin{itemize}
\item \file{i386} for the Intel series of processors.
\item \file{m68k} for the Motorola m68000 series of processors.
\end{itemize}
\item An OS-independent and Processor independent directory: \file{inc}. This contains complete units, and include files containing interface parts of units.
\end{enumerate}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Appendix C
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Compiler limits}
\label{ch:AppC}
Although many of the restrictions imposed by the MS-DOS system are removed by use of an extender, or use of another operating system, there still are some limitations to the compiler:
\begin{enumerate}
\item Procedure or Function definitions can be nested to a level of 32.
\item Maximally 255 units can be used in a program when using the real-mode compiler (i.e.
a binary that was compiled by Borland Pascal). When using the 32-bit compiler, the limit is set to 1024. You can change this by redefining the \var{maxunits} constant in the \file{files.pas} compiler source file.
\end{enumerate}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Appendix D
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Compiler modes}
\label{ch:AppD}
Here we list the exact effect of the different compiler modes. They can be set with the \var{\$Mode} switch, or by command line switches.

\section{FPC mode}
This mode is selected by the \var{{\$MODE FPC}} switch. On the command-line, this means that you use none of the other compatibility mode switches. It is the default mode of the compiler. This means essentially:
\begin{enumerate}
\item You must use the address operator to assign procedural variables.
\item A forward declaration must be repeated exactly the same by the implementation of a function/procedure. In particular, you cannot omit the parameters when implementing the function or procedure.
\item Overloading of functions is allowed.
\item Nested comments are allowed.
\item The Objpas unit is NOT loaded.
\item You can use the cvar type.
\item PChars are converted to strings automatically.
\end{enumerate}

\section{TP mode}
This mode is selected by the \var{{\$MODE TP}} switch. On the command-line, this mode is selected by the \var{-So} switch.
\begin{enumerate}
\item You cannot use the address operator to assign procedural variables.
\item A forward declaration need not be repeated exactly the same by the implementation of a function/procedure. In particular, you can omit the parameters when implementing the function or procedure.
\item Overloading of functions is not allowed.
\item The Objpas unit is NOT loaded.
\item Nested comments are not allowed.
\item You cannot use the cvar type.
\end{enumerate}

\section{Delphi mode}
This mode is selected by the \var{{\$MODE DELPHI}} switch.
On the command-line, this mode is selected by the \var{-Sd} switch.
\begin{enumerate}
\item You cannot use the address operator to assign procedural variables.
\item A forward declaration need not be repeated exactly the same by the implementation of a function/procedure. In particular, you can omit the parameters when implementing the function or procedure.
\item Overloading of functions is not allowed.
\item Nested comments are not allowed.
\item The Objpas unit is loaded right after the system unit. One of the consequences of this is that the type \var{Integer} is redefined as \var{Longint}.
\end{enumerate}

\section{GPC mode}
This mode is selected by the \var{{\$MODE GPC}} switch. On the command-line, this mode is selected by the \var{-Sp} switch.
\begin{enumerate}
\item You must use the address operator to assign procedural variables.
\item A forward declaration need not be repeated exactly the same by the implementation of a function/procedure. In particular, you can omit the parameters when implementing the function or procedure.
\item Overloading of functions is not allowed.
\item The Objpas unit is NOT loaded.
\item Nested comments are not allowed.
\item You cannot use the cvar type.
\end{enumerate}

\section{OBJFPC mode}
This mode is selected by the \var{{\$MODE OBJFPC}} switch. On the command-line, this mode is selected by the \var{-S2} switch.
\begin{enumerate}
\item You must use the address operator to assign procedural variables.
\item A forward declaration must be repeated exactly the same by the implementation of a function/procedure. In particular, you cannot omit the parameters when implementing the function or procedure.
\item Overloading of functions is allowed.
\item Nested comments are allowed.
\item The Objpas unit is loaded right after the system unit. One of the consequences of this is that the type \var{Integer} is redefined as \var{Longint}.
\item You can use the cvar type.
\item PChars are converted to strings automatically.
\end{enumerate}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Appendix E
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Using \file{fpcmake}}
\label{ch:makefile}
\newcommand{\mvar}[1]{\var{\$(#1)}}

\section{Introduction}
\fpc comes with a special makefile tool, \file{fpcmake}, which can be used to construct a \file{Makefile} for use with \gnu \file{make}. All sources from the \fpc team are compiled with this system.

\file{fpcmake} uses a file \file{Makefile.fpc} and constructs a file \file{Makefile} from it, based on the settings in \file{Makefile.fpc}.

The following sections explain what settings can be set in \file{Makefile.fpc}, what variables are set by \var{fpcmake}, what variables it expects to be set, and what targets it defines. After that, some settings in the resulting \file{Makefile} are explained.

\section{Usage}
\file{fpcmake} reads a \file{Makefile.fpc} and converts it to a \file{Makefile} suitable for reading by \gnu \file{make} to compile your projects. It is similar in functionality to GNU \file{configure} or \file{Imake} for making X projects.

\file{fpcmake} accepts filenames of makefile description files as its command-line arguments. For each of these files it will create a \file{Makefile} in the same directory where the file is located, overwriting any existing file with that name.

If no options are given, it just attempts to read the file \file{Makefile.fpc} in the current directory and tries to construct a \file{Makefile} from it. Any previously existing \file{Makefile} will be erased.

% Makefile.fpc format.
\section{Format of the configuration file}
This section describes the rules that can be present in the file that is fed to \file{fpcmake}.

The file \file{Makefile.fpc} is a plain ASCII file that contains a number of pre-defined sections as in a \windows \file{.ini}-file, or a Samba configuration file.
They look more or less as follows:
\begin{verbatim}
[targets]
units=mysql_com mysql_version mysql
examples=testdb

[dirs]
fpcdir=../..

[rules]
mysql$(PPUEXT): mysql$(PASEXT) mysql_com$(PPUEXT)

testdb$(EXEEXT): testdb$(PASEXT) mysql$(PPUEXT)
\end{verbatim}
The following sections are recognized (in alphabetical order):

\subsection{Clean}
Specifies rules for cleaning the directory of units and programs. The following entries are recognized:
\begin{description}
\item[units] names of all units that should be removed when cleaning. Don't specify extensions, the makefile will append these by itself.
\item[files] names of files that should be removed. Specify full filenames.
\end{description}

\subsection{Defaults}
The \var{defaults} section contains some default settings. The following keywords are recognized:
\begin{description}
\item[defaultdir]
\item[defaultbuilddir]
\item[defaultinstalldir]
\item[defaultzipinstalldir]
\item[defaultcleandir]
\item[defaultrule] Specifies the default rule to execute. \file{fpcmake} will make sure that this rule is executed if make is executed without arguments, i.e., without an explicit target.
\item[defaulttarget] Specifies the default operating system target for which the \file{Makefile} should compile the units and programs. By default this is determined from the default compiler target.
\item[defaultcpu] Specifies the default target processor for which the \file{Makefile} should compile the units and programs. By default this is determined from the default compiler processor.
\end{description}

\subsection{Dirs}
In this section you can specify the location of several directories which the \file{Makefile} could need for compiling other packages or for finding the units.

The following keywords are recognised:
\begin{description}
\item[fpcdir] Specifies the directory where all the \fpc source trees reside. Below this directory the \file{Makefile} expects to find the \file{rtl}, \file{fcl} and \file{packages} directory trees.
\item[packagedir] Specifies the directory where all the package source directories are. By default this equals \mvar{FPCDIR}\var{/packages}.
\item[toolkitdir] Specifies the directory where toolkit source directories are.
\item[componentdir] Specifies the directory where component source directories are.
\item[unitdir] A colon-separated list of directories that must be added to the unit search path of the compiler.
\item[libdir] A colon-separated list of directories that must be added to the library search path of the compiler.
\item[objdir] A colon-separated list of directories that must be added to the object file search path of the compiler.
\item[targetdir] Specifies the directory where the compiled programs should go.
\item[sourcesdir] A space separated list of directories where sources can reside. This will be used for the \var{vpath} setting of \gnu \file{make}.
\item[unittargetdir] Specifies the directory where the compiled units should go.
\item[incdir] A colon-separated list of directories that must be added to the include file search path of the compiler.
\end{description}

\subsection{Info}
This section can be used to customize the information generating targets that \file{fpcmake} generates. It is simply a series of boolean values that specify whether a certain part of the \var{info} target will be generated. The following keywords are recognised:
\begin{description}
\item[infoconfig] Specifies whether configuration info should be shown. By default this is \var{True}.
\item[infodirs] Specifies whether a list of subdirectories to be treated will be shown. By default this is \var{False}.
\item[infotools] Specifies whether a list of tools that are used by the makefile will be shown. By default this is \var{False}.
\item[infoinstall] Specifies whether the installation rules will be shown. By default this is \var{True}.
\item[infoobjects] Specifies whether the \file{Makefile} objects will be shown, i.e.
a list of all units and programs that will be built by \file{make}.
\end{description}

\subsection{Install}
Contains instructions for installation of your units and programs. The following keywords are recognized:
\begin{description}
\item[dirprefix] is the directory below which all installs are done. This corresponds to the \var{--prefix} argument to \gnu \file{configure}. It is used for the installation of programs and units. By default, this is \file{/usr} on \linux, and \file{/pp} on all other platforms.
\item[dirbase] The directory that is used as the base directory for the installation of units. By default this is \var{dirprefix} appended with \var{/lib/fpc/FPC\_VERSION} for \linux or simply the \var{dirprefix} on other platforms.
\end{description}
Units will be installed in the subdirectory \file{units/\$(OS\_TARGET)} of the \var{dirbase} entry.

\subsection{Libs}
This section specifies what units should be merged into a library, and what external libraries are needed. It can contain the following keywords:
\begin{description}
\item[libname] the name of the library that should be created.
\item[libunits] a comma-separated list of units that should be moved into one library.
\item[needgcclib] a boolean value that specifies whether the \file{gcc} library is needed. This will make sure that the path to the GCC library is inserted in the library search path.
\item[needotherlib] (\linux only) a boolean value that tells the makefile that it should add all library directories from the \file{ld.so.conf} file to the compiler command-line.
\end{description}

\subsection{Packages}
Which packages must be used. This section can contain the following keywords:
\begin{description}
\item[packages] A comma-separated list of packages that are needed to compile the targets. Valid for all platforms. In order to differentiate between platforms, you can prepend the keyword \var{packages} with the OS you are compiling for, e.g.
\var{linuxpackages} if you want the makefile to use the listed packages on
\linux only.
\item[fcl] This is a boolean value (0 or 1) that indicates whether the FCL is
used.
\item[rtl] This is a boolean value (0 or 1) that indicates whether the RTL
should be recompiled.
\end{description}

\subsection{Postsettings}
Anything that is in this section will be inserted as-is in the makefile
\textit{after} the makefile rules that are generated by fpcmake, but
\textit{before} the general configuration rules.
In this section, you cannot use variables that are defined by fpcmake rules,
but you can define additional rules and configuration variables.

\subsection{Presettings}
Anything that is in this section will be inserted as-is in the makefile
\textit{before} the makefile target rules that are generated by fpcmake. This
means that you cannot use any variables that are normally defined by fpcmake
rules.

\subsection{Rules}
In this section you can insert dependency rules and any other targets
you wish to have. Do not insert 'default rules' here.

\subsection{Sections}
Here you can specify which 'rule sections' should be included in the
\file{Makefile}. The sections consist of a series of boolean keywords; each
keyword decides whether a particular section will be written to the makefile.
By default, all sections are written.

You can have the following boolean keywords in this section.
\begin{description}
\item[none] If this is set to true, then no sections are written.
\item[units] If set to \var{False}, \file{fpcmake} omits the rules for
compiling units.
\item[exes] If set to \var{False}, \file{fpcmake} omits the rules for
compiling executables.
\item[loaders] If set to \var{False}, \file{fpcmake} omits the rules for
assembling assembler files.
\item[examples] If set to \var{False}, \file{fpcmake} omits the rules for
compiling examples.
\item[package] If set to \var{False}, \file{fpcmake} omits the rules for
making packages.
\item[compile] If set to \var{False}, \file{fpcmake} omits the generic rules
for compiling pascal files.
\item[depend] If set to \var{False}, \file{fpcmake} omits the dependency rules.
\item[install] If set to \var{False}, \file{fpcmake} omits the rules for
installing everything.
\item[sourceinstall] If set to \var{False}, \file{fpcmake} omits the rules for
installing the sources.
\item[zipinstall] If set to \var{False}, \file{fpcmake} omits the rules for
installing archives.
\item[clean] If set to \var{False}, \file{fpcmake} omits the rules for
cleaning the directories.
\item[libs] If set to \var{False}, \file{fpcmake} omits the rules for making
libraries.
\item[command] If set to \var{False}, \file{fpcmake} omits the rules for
composing the command-line based on the various variables.
\item[exts] If set to \var{False}, \file{fpcmake} omits the rules for making
libraries.
\item[dirs] If set to \var{False}, \file{fpcmake} omits the rules for running
make in subdirectories.
\item[tools] If set to \var{False}, \file{fpcmake} omits the rules for running
some tools such as the archiver, UPX and zip.
\item[info] If set to \var{False}, \file{fpcmake} omits the rules for
generating information.
\end{description}

\subsection{Targets}
In this section you can define the various targets. The following keywords
can be used there:
\begin{description}
\item[dirs] A space-separated list of directories where make should also be run.
\item[examples] A space-separated list of example programs that need to be
compiled when the user asks to compile the examples. Do not specify an
extension, the extension will be appended.
\item[loaders] A space-separated list of names of assembler files that must
be assembled. Do not specify the extension, the extension will be appended.
\item[programs] A space-separated list of program names that need to be
compiled. Do not specify an extension, the extension will be appended.
\item[rst] a list of \file{rst} files that need to be converted to \file{.po}
files for use with \gnu \file{gettext} and internationalization routines.
\item[units] A space-separated list of unit names that need to be compiled. Do
not specify an extension, just the name of the unit as it would appear in a
\var{uses} clause is sufficient.
\end{description}

\subsection{Tools}
In this section you can specify which tools are needed. Definitions to
use each of the listed tools will be inserted in the makefile, depending
on the setting in this section.

Each keyword is a boolean keyword; you can switch the use of a tool on or off
with it.

The following keywords are recognised:
\begin{description}
\item[toolppdep] Use \file{ppdep}, the dependency tool. \var{True} by default.
\item[toolppumove] Use \file{ppumove}, the Free Pascal unit mover. \var{True}
by default.
\item[toolppufiles] Use the \file{ppufiles} tool to determine dependencies
of unit files. \var{True} by default.
\item[toolsed] Use \file{sed}, the stream editor. \var{False} by default.
\item[tooldata2inc] Use the \file{data2inc} tool to create include files from
data files. \var{False} by default.
\item[tooldiff] Use the \gnu \file{diff} tool. \var{False} by default.
\item[toolcmp] Use the \file{cmp} file comparer tool. \var{False} by default.
\item[toolupx] Use the \file{upx} executable packer. \var{True} by default.
\item[tooldate] Use the \file{date} date displaying tool. \var{True} by default.
\item[toolzip] Use the \file{zip} file archiver. This is used by the zip
targets. \var{True} by default.
\end{description}

\subsection{Zip}
This section can be used to make zip files from the compiled units and
programs. By default all compiled units are zipped. The zip behaviour can
be influenced with the presettings and postsettings sections.

The following keywords can be used in this section:
\begin{description}
\item[zipname] is the name of the zip file that will be produced.
\item[ziptarget] is the name of a makefile target that will be executed before
the zip is made. By default this is the \var{install} target.
\end{description}

\section{Programs needed to use the generated makefile}
The following programs are needed by the generated \file{Makefile}
to function correctly:
\begin{description}
\item[cp] a copy program.
\item[date] a program that prints the date.
\item[install] a program to install files.
\item[make] the \file{make} program, obviously.
\item[pwd] a program that prints the current working directory.
\item[rm] a program to delete files.
\end{description}
These are standard programs on \linux systems, with the possible exception of
\file{make}. For \dos or \windowsnt, they can be found in the
file \file{gnuutils.zip} on the \fpc FTP site.

The following programs are optionally needed if you use some special targets.
Which ones you need is controlled by the settings in the \var{tools} section.
\begin{description}
\item[cmp] a \dos and \windowsnt file comparer. Used if \var{toolcmp} is
\var{True}.
\item[diff] a file comparer. Used if \var{tooldiff} is \var{True}.
\item[ppdep] the ppdep dependency lister. Used if \var{toolppdep} is
\var{True}. Distributed with \fpc.
\item[ppufiles] the ppufiles unit file dependency lister. Used if
\var{toolppufiles} is \var{True}. Distributed with \fpc.
\item[ppumove] the \fpc unit mover. Used if \var{toolppumove} is \var{True}.
Distributed with \fpc.
\item[sed] the \file{sed} program. Used if \var{toolsed} is \var{True}.
\item[upx] the UPX executable packer. Used if \var{toolupx} is \var{True}.
\item[zip] the zip archiver program. Used if \var{toolzip} is \var{True}.
\end{description}
All of these can also be found on the \fpc FTP site for \dos and \windowsnt.
\file{ppdep}, \file{ppufiles} and \file{ppumove} are distributed with the \fpc
compiler.

%
\section{Variables that affect the generated makefile}
The makefile generated by \file{fpcmake} contains a lot of variables.
Some of them are set in the makefile itself, others can be set and are taken
into account when set.

These variables can be split in several groups:
\begin{itemize}
\item Environment variables.
\item Directory variables.
\item Compiler command-line variables.
\end{itemize}
Each group will be discussed separately.

\subsection{Environment variables}
In principle, \file{fpcmake} doesn't expect any environment variable to be set.
Optionally, you can set the variable \var{FPCMAKEINI} which should contain
the name of a file with the basic rules that \file{fpcmake} will generate.

By default, \file{fpcmake} has a compiled-in copy of \file{fpcmake.ini}, which
contains the basic rules, so there should be no need to set this variable.
You can set it, however, if you wish to change the way in which fpcmake works
and creates rules.

The initial \file{fpcmake.ini} file can be found in the \file{utils} source
package on the \fpc ftp site.

\subsection{Directory variables}
The first set of variables controls the directories that are
recognised in the makefile. They should not be set in the
\file{Makefile.fpc} file, but can be specified on the command-line.
\begin{description}
\item[INCDIR] this is a list of directories, separated by spaces, that will
be added as include directories to the compiler command-line. Each
directory in the list is prepended with \var{-I} and added to the
compiler options.
\item[LIBDIR] is a list of library paths, separated by spaces. Each
directory in the list is prepended with \var{-Fl} and added to the
compiler options.
\item[OBJDIR] is a list of object file directories, separated by spaces, that
is added to the object files path, i.e. each directory in the list is
prepended with \var{-Fo}.
\end{description}

\subsection{Compiler command-line variables}
The following variables can be set on the \file{make} command-line; they will
be recognised and integrated in the compiler command-line:
\begin{description}
\item[OPT] Any options that you want to pass to the compiler. The contents
of \var{OPT} is simply added to the compiler command-line.
\item[OPTDEF] Are optional defines, added to the command-line of the
compiler. They do not get \var{-d} prepended.
\end{description}

\section{Variables set by \file{fpcmake}}
All of the following variables are only set by \file{fpcmake} if they aren't
already defined. This means that you can override them by setting them on the
make command-line, or by setting them in the \var{presettings} section.
Most of them, however, are correctly determined by the generated
\file{Makefile} or set by your settings in the configuration file.

The following sets of variables are defined:
\begin{itemize}
\item Directory variables.
\item Program names.
\item File extensions.
\item Target files.
\end{itemize}
Each of these sets is discussed in the following subsections.

\subsection{Directory variables}
The following directories are defined by the makefile:
\begin{description}
\item[BASEDIR] is set to the current directory if the \file{pwd} command is
available. If not, it is set to '.'.
\item[BASEINSTALLDIR] is the base for all directories where units are
installed. By default, on \linux, this is set to
\mvar{PREFIXINSTALLDIR}\var{/lib/fpc/}\mvar{RELEASEVER}.\\
On other systems, it is set to \mvar{PREFIXINSTALLDIR}. You can also set it
with the \var{dirbase} keyword in the \var{Install} section.
\item[BININSTALLDIR] is set to \mvar{BASEINSTALLDIR}/\var{bin} on \linux, and\\
\mvar{BASEINSTALLDIR}/\var{bin}/\mvar{OS\_TARGET} on other systems. This is the
place where binaries are installed.
\item[GCCLIBDIR] (\linux only) is set to the directory where \file{libgcc.a}
is.
If \var{needgcclib} is set to \var{True} in the \var{Libs} section, then
this directory is added to the compiler command-line with \var{-Fl}.
\item[LIBINSTALLDIR] is set to \mvar{BASEINSTALLDIR} on \linux,\\
and \mvar{BASEINSTALLDIR}/\var{lib} on other systems.
\item[NEEDINCDIR] is a space-separated list of include file directories. Each
directory in the list is prepended with \var{-I} and added to the
compiler options. Set by the \var{incdir} keyword in the \var{Dirs} section.
\item[NEEDLIBDIR] is a space-separated list of library paths. Each
directory in the list is prepended with \var{-Fl} and added to the
compiler options. Set by the \var{libdir} keyword in the \var{Dirs} section.
\item[NEEDOBJDIR] is a list of object file directories, separated by spaces.
Each directory in the list is prepended with \var{-Fo} and added to the
compiler options. Set by the \var{objdir} keyword in the \var{Dirs} section.
\item[NEEDUNITDIR] is a list of unit directories, separated by spaces.
Each directory in the list is prepended with \var{-Fu} and added to the
compiler options. Set by the \var{unitdir} keyword in the \var{Dirs} section.
\item[TARGETDIR] This directory is added as the output directory of the
compiler, where all units and executables are written, i.e. it gets
\var{-FE} prepended. It is set by the \var{targetdir} keyword in the
\var{Dirs} section.
\item[TARGETUNITDIR] If set, this directory is added as the output directory
of the compiler for units, i.e. it gets \var{-FU} prepended. It is set by the
\var{unittargetdir} keyword in the \var{Dirs} section.
\item[PREFIXINSTALLDIR] is set to \file{/usr} on \linux, \file{/pp} on \dos
or \windowsnt. Set by the \var{dirprefix} keyword in the \var{Install}
section.
\item[UNITINSTALLDIR] is where units will be installed. This is set to\\
\mvar{BASEINSTALLDIR}/\mvar{UNITPREFIX} \\
on \linux. On other systems, it is set to \\
\mvar{BASEINSTALLDIR}/\mvar{UNITPREFIX}/\mvar{OS\_TARGET}.
\end{description}

\subsection{Target variables}
The second set of variables controls the targets that are constructed
by the makefile. They are created by \file{fpcmake}, so you can use
them in your rules, but you shouldn't assign values to them yourself.
\begin{description}
\item[EXEOBJECTS] This is a list of executable names that will be compiled.
The makefile appends \mvar{EXEEXT} to these names. It is set by the
\var{programs} keyword in the \var{Targets} section.
\item[LOADEROBJECTS] is a list of space-separated names that identify
loaders to be compiled. This is mainly used in the compiler's RTL sources.
It is set by the \var{loaders} keyword in the \var{Targets} section.
\item[UNITOBJECTS] This is a list of unit names that will be compiled. The
makefile appends \mvar{PPUEXT} to each of these names to form the unit file
name. The source name is formed by adding \mvar{PASEXT}.
It is set by the \var{units} keyword in the \var{Targets} section.
\item[ZIPNAME] is the name of the archive that will be created by the
makefile.
It is set by the \var{zipname} keyword in the \var{Zip} section.
\item[ZIPTARGET] is the target that is built before the archive is made.
If it is built successfully, the zip archive is made.
It is set by the \var{ziptarget} keyword in the \var{Zip} section.
\end{description}

\subsection{Compiler command-line variables}
The following variables control the compiler command-line:
\begin{description}
\item[CPU\_SOURCE] the source CPU type, i.e. the CPU of the system you are
compiling on. This is determined by the Makefile itself.
\item[CPU\_TARGET] the target CPU type is added as a define to the compiler
command line. This is determined by the Makefile itself.
\item[LIBNAME] if a shared library is requested, this is the name of the
shared library to produce. Don't add \var{lib} to this, the compiler will do
that. It is set by the \var{libname} keyword in the \var{Libs} section.
\item[NEEDGCCLIB] if this variable is defined, then the path to \file{libgcc}
is added to the library path. It is set by the \var{needgcclib} keyword in the
\var{Libs} section.
\item[NEEDOTHERLIB] (\linux only) If this is defined, then the makefile will
append all directories that appear in \file{/etc/ld.so.conf} to the library
path. It is set by the \var{needotherlib} keyword in the \var{Libs} section.
\item[OS\_TARGET] What platform you want to compile for. Added to the
compiler command-line with a \var{-T} prepended.
%\item[SMARTLINK] if \var{SMARTLINK} is set to \var{YES} then the compiler
%will output smartlinked units if \var{LIBTYPE} is not set to \var{shared}.
\end{description}

\subsection{Program names}
The following variables are program names, used in makefile targets.
\begin{description}
\item[AS] The assembler. Defaults to \file{as}.
\item[COPY] a file copy program. Defaults to \file{cp -fp}.
\item[CMP] a program to compare files. Defaults to \file{cmp}.
\item[DEL] a file removal program. Defaults to \file{rm -f}.
\item[DELTREE] a directory removal program. Defaults to \file{rm -rf}.
\item[DATE] a program to display the date.
\item[DIFF] a program to produce diff files.
\item[ECHO] an echo program.
\item[FPC] the Free Pascal compiler executable. Defaults to
\file{ppc386.exe}.
\item[INSTALL] a program to install files. Defaults to \file{install -m 644}
on \linux.
\item[INSTALLEXE] a program to install executable files. Defaults to
\file{install -m 755} on \linux.
\item[LD] The linker. Defaults to \file{ld}.
\item[LDCONFIG] (\linux only) the program used to update the loader cache.
\item[MKDIR] a program to create directories if they don't exist yet. Defaults
to \file{install -m 755 -d}.
\item[MOVE] a file move program. Defaults to \file{mv -f}.
\item[PP] the Free Pascal compiler executable. Defaults to
\file{ppc386.exe}.
\item[PPAS] the name of the shell script created by the compiler if the
\var{-s} option is specified.
This command will be executed after compilation, if the \var{-s} option was
detected among the options.
\item[PPUMOVE] the program to move units into one big unit library.
\item[SED] a stream editor program. Defaults to \file{sed}.
\item[UPX] an executable packer to compress your executables into
self-extracting compressed executables.
\item[ZIPPROG] a zip program to compress files. Zip targets are made with
this program.
\end{description}

\subsection{File extensions}
The following variables denote extensions of files. These variables include
the \var{.} (dot) of the extension. They are appended to object names.
\begin{description}
\item[ASMEXT] is the extension of assembler files produced by the compiler.
\item[LOADEREXT] is the extension of the assembler files that make up the
executable startup code.
\item[OEXT] is the extension of the object files that the compiler creates.
\item[PACKAGESUFFIX] is a suffix that is appended to package names in zip
targets. It allows packages to be made for different OSes.
\item[PASEXT] is the extension of pascal files used in the compile rules.
It is determined by looking at the first \var{EXEOBJECTS} source file or
the first \var{UNITOBJECTS} file.
\item[PPLEXT] is the extension of shared library unit files.
\item[PPUEXT] is the extension of default units.
\item[SHAREDLIBEXT] is the extension of shared libraries.
\item[SMARTEXT] is the extension of smartlinked unit assembler files.
\item[STATICLIBEXT] is the extension of static libraries.
\end{description}

\subsection{Target files}
The following variables are defined to make targets and rules easier:
\begin{description}
\item[COMPILER] is the complete compiler command-line, with all options
added, after all \file{Makefile} variables have been examined.
\item[DATESTR] contains the date.
\item[EXEFILES] is a list of executables that will be created by the makefile.
\item[EXEOFILES] is a list of executable object files that will be created by
the makefile.
\item[LOADEROFILES] is a list of object files that will be made from the
loader assembler files. This is mainly for use in the compiler's RTL sources.
\item[UNITPPUFILES] a list of unit files that will be made. This is just
the list of unit objects, with the correct unit extension appended.
\item[UNITOFILES] a list of unit object files that will be made. This is just
the list of unit objects, with the correct object file extension appended.
\end{description}

\section{Rules and targets created by \file{fpcmake}}
The \file{makefile.fpc} defines a series of targets, which can be called by
your own targets. They have names that resemble default names (such as 'all',
'clean'), only they have \var{fpc\_} prepended.

\subsection{Pattern rules}
The makefile makes the following pattern rules:
\begin{description}
\item[units] how to make a pascal unit from a pascal source file.
\item[executables] how to make an executable from a pascal source file.
\item[object file] how to make an object file from an assembler file.
\end{description}

\subsection{Build rules}
The following build targets are defined:
\begin{description}
\item[fpc\_all] target that builds all units and executables as well as
loaders. If \var{DEFAULTUNITS} is defined, executables are excluded from the
targets.
\item[fpc\_exes] target to make all executables in \var{EXEOBJECTS}.
\item[fpc\_loaders] target to make all files in \var{LOADEROBJECTS}.
\item[fpc\_shared] target that makes all units as dynamic libraries.
\item[fpc\_smart] target that makes all units as smartlinked units.
\item[fpc\_units] target to make all units in \var{UNITOBJECTS}.
\end{description}

\subsection{Cleaning rules}
The following cleaning targets are defined:
\begin{description}
\item[fpc\_clean] cleans all files that result when \var{fpc\_all} was made.
\item[fpc\_cleanall] does the same as \var{fpc\_clean}, but also deletes all
object, unit and assembler files that are present.
\end{description}

\subsection{Archiving rules}
The following archiving targets are defined:
\begin{description}
\item[fpc\_zipinstall] will create an archive file (its name is taken from
\mvar{ZIPNAME}) from the compiled units.
\item[fpc\_zipsourceinstall] will create an archive file (its name is taken
from \mvar{ZIPNAME}) from the sources.
\end{description}
The zip is made using the \var{ZIPPROG} program. Under \linux, a
\file{.tar.gz} file is created.

\subsection{Informative rules}
The following targets produce information about the makefile:
\begin{description}
\item[fpc\_cfginfo] gives general configuration information: the location of
the makefile, the compiler version, target OS, CPU.
\item[fpc\_dirinfo] gives the directories used by the compiler.
\item[fpc\_info] executes all other info targets.
\item[fpc\_installinfo] gives all directories where files will be installed.
\item[fpc\_objectinfo] lists all objects that will be made.
\item[fpc\_toolsinfo] lists all defined tools.
\end{description}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Appendix F
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Compiling the compiler yourself}
\label{ch:AppF}

\section{Introduction}
The \fpc team releases at intervals a completely prepared package, with
compiler and units all ready to use, the so-called releases. After a release,
work on the compiler continues, bugs are fixed and features are added.
The \fpc team doesn't make a new release whenever they change something in
the compiler; instead, the sources are available for anyone to use and
compile. Compiled versions of RTL and compiler are also made daily, and put
on the web.

There are, nevertheless, circumstances when you'll want to compile the
compiler yourself. For instance if you made changes to compiler code, or when
you download the compiler via CVS.

There are essentially 2 ways of recompiling the compiler: by hand, or using
the makefiles.
Each of these methods will be discussed.

\section{Before you begin}
To compile the compiler easily, it is best to keep the following directory
structure (a base directory of \file{/pp/src} is assumed, but that may be
different):
\begin{verbatim}
/pp/src/Makefile
       /makefile.fpc
       /rtl/linux
           /inc
           /i386
           /...
       /compiler
\end{verbatim}
If you want to use the makefiles, you {\em must} use the above directory
tree.

The compiler and rtl source are zipped in such a way that if you unzip both
files in the same directory (\file{/pp/src} in the above) the above directory
tree results.

The \file{makefile.fpc} and \file{Makefile} come from the \file{base.zip} file
on the ftp site. If you compile manually, you don't need them.

There are 2 ways to start compiling the compiler and RTL. Which one to use
depends on the situation. Usually, the RTL must be compiled first, before
compiling the compiler, after which the compiler is compiled using the current
compiler. In some special cases the compiler must be compiled first, with a
previously compiled RTL.

How to decide which should be compiled first? In general, the answer is that
you should compile the RTL first. There are 2 exceptions to this rule:
\begin{enumerate}
\item The first case is when some of the internal routines in the RTL
have changed, or if new internal routines appeared. Since the old compiler
doesn't know about these changed internal routines, it will emit function
calls that are based on the old compiled RTL, and hence are not correct.
Either the result will not link, or the binary will give errors.
\item The second case is when something is added to the RTL that the compiler
needs to know about (a new default assembler mechanism, for example).
\end{enumerate}
How to know if one of these things has occurred? There is no way to know,
except by mailing the \fpc team. If you cannot recompile the compiler when you
first compile the RTL, then try the other way.
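
As a rough sketch, this fallback could look as follows in a Bourne-style
shell, using the \file{make} targets and variables described in the next
section. It assumes the directory layout shown above and previously compiled
RTL units in \file{/pp/rtl}; both locations are examples only and may differ
on your system.
\begin{verbatim}
cd /pp/src/compiler
# Usual order: the RTL is compiled first, then the compiler.
# If that fails, build the compiler first against an RTL that is
# already compiled (assumed here to be in /pp/rtl), and then run
# the normal cycle starting from the freshly built compiler.
make cycle || {
    make clean
    make all UNITDIR=/pp/rtl && make cycle PP=./ppc386
}
\end{verbatim}
The second branch is only attempted when the first \var{make cycle} fails, so
in the normal case nothing extra is built.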
\section{Compiling using \file{make}}
When compiling with \file{make} it is necessary to have the above directory
structure. Compiling the compiler is achieved with the target \var{cycle}.

Under normal circumstances, recompiling the compiler is limited to the
following instructions (assuming you start in directory \file{/pp/src}):
\begin{verbatim}
cd compiler
make cycle
\end{verbatim}
This will work only if the \file{makefile.fpc} is installed correctly and
if the needed tools are present in the \var{PATH}. Which tools must be
installed can be found in appendix \ref{ch:makefile}.

The above instructions will do the following:
\begin{enumerate}
\item Using the current compiler, the RTL is compiled in the correct
directory, which is determined by the OS you are under. For example, under
\linux, the RTL is compiled in directory \file{rtl/linux}.
\item The compiler is compiled using the newly compiled RTL. If successful,
the newly compiled compiler executable is copied to a temporary executable.
\item Using the temporary executable from the previous step, the RTL is
re-compiled.
\item Using the temporary executable and the newly compiled RTL from the
last step, the compiler is compiled again.
\end{enumerate}
The last two steps are repeated until three passes have been made or
until the generated compiler binary is equal to the binary it was compiled
with. This process ensures that the compiler binary is correct.

Compiling for another target:
When you want to compile the compiler for another target, you must specify
the \var{OS\_TARGET} makefile variable. It can be set to the following values:
\var{win32}, \var{go32v2}, \var{os2} and \var{linux}.
As an example, cross-compilation for the go32v2 target from the win32 target
is chosen:
\begin{verbatim}
cd compiler
make cycle OS_TARGET=go32v2
\end{verbatim}
This will compile the go32v2 RTL, and compile a \var{go32v2} compiler.

If you want to compile a new compiler, but you want the compiler to be
compiled first using an existing compiled RTL, you should specify the
\var{all} target, and specify another RTL directory than the default (which
is the \file{../rtl/\$(OS\_TARGET)} directory). For instance, assuming that
the compiled RTL units are in \file{/pp/rtl}, you could type
\begin{verbatim}
cd compiler
make clean
make all UNITDIR=/pp/rtl
\end{verbatim}
This will then compile the compiler using the RTL units in \file{/pp/rtl}.
After this has been done, you can do the 'make cycle', starting with this
compiler:
\begin{verbatim}
make cycle PP=./ppc386
\end{verbatim}
This will do the \var{make cycle} from above, but will start with the compiler
that was generated by the \var{make all} instruction.

In all cases, many options can be passed to \file{make} to influence the
compile process. In general, the makefiles add any needed compiler options to
the command-line, so that the RTL and compiler can be compiled. You can
specify additional options (e.g. optimization options) by passing them in
\var{OPT}.

\section{Compiling by hand}
Compiling by hand is difficult and tedious, but can be done. We'll treat the
compilation of RTL and compiler separately.

\subsection{Compiling the RTL}
To recompile the RTL, so a new compiler can be built, at least the following
units must be built, in the order specified:
\begin{enumerate}
\item[loaders] the program stubs, that are the startup code for each pascal
program. These files have the \file{.as} extension, because they are written
in assembler. They must be assembled with the \gnu \file{as} assembler. These
stubs are in the OS-dependent directory, except for \linux, where they are in
a processor-dependent subdirectory of the linux directory (\file{i386} or
\file{m68k}).
\item[system] the \file{system} unit. This unit is named differently on
different systems:
\begin{itemize}
\item Only on GO32v2, it's called \file{system}.
\item For \linux it's called \file{syslinux}.
\item For \windowsnt it's called \file{syswin32}.
\item For \ostwo it's called \file{sysos2}.
\end{itemize}
This unit resides in the OS-dependent subdirectories of the RTL.
\item[strings] The strings unit. This unit resides in the \file{inc}
subdirectory of the RTL.
\item[dos] The \file{dos} unit. It resides in the OS-dependent subdirectory
of the RTL. Possibly other units will be compiled as a consequence of trying
to compile this unit (e.g. on \linux, the \file{linux} unit will be compiled,
on go32, the \file{go32} unit will be compiled).
\item[objects] the objects unit. It resides in the \file{inc} subdirectory of
the RTL.
\end{enumerate}
To compile these units on an i386, the following statements will do:
\begin{verbatim}
ppc386 -Tlinux -b- -Fi../inc -Fi../i386 -FE. -di386 -Us -Sg syslinux.pp
ppc386 -Tlinux -b- -Fi../inc -Fi../i386 -FE. -di386 ../inc/strings.pp
ppc386 -Tlinux -b- -Fi../inc -Fi../i386 -FE. -di386 dos.pp
ppc386 -Tlinux -b- -Fi../inc -Fi../i386 -FE. -di386 ../inc/objects.pp
\end{verbatim}
These are the minimum command-line options needed to compile the RTL.

For another processor, you should change the \var{i386} into the appropriate
processor. For another operating system (target) you should change the
\file{syslinux} to the appropriate system unit file, and you should change
the target OS setting (\var{-T}).

Depending on the target OS there are other units that you may wish to
compile, but which are not strictly needed to recompile the compiler.
The following units are available for all platforms:
\begin{description}
\item[objpas] Needed for Delphi mode. Needs \var{-S2} as an option. Resides
in the \file{objpas} subdirectory.
\item[sysutils] many utility functions, like in Delphi. Resides in the
\file{objpas} directory, and needs \var{-S2} to compile.
\item[typinfo] functions to access RTTI information, like Delphi. Resides in
the \file{objpas} directory.
\item[math] math functions like in Delphi.
Resides in the \file{objpas} directory.
\item[mmx] extensions for MMX class Intel processors. Resides in the
\file{i386} directory.
\item[getopts] a GNU compatible getopts unit. Resides in the \file{inc}
directory.
\item[heaptrc] to debug the heap. Resides in the \file{inc} directory.
\end{description}

\subsection{Compiling the compiler}
Compiling the compiler can be done with one statement. It's always best to
remove all units from the compiler directory first, so something like
\begin{verbatim}
rm *.ppu *.o
\end{verbatim}
on \linux, and on \dos
\begin{verbatim}
del *.ppu
del *.o
\end{verbatim}
After this, the compiler can be compiled with the following command-line:
\begin{verbatim}
ppc386 -Tlinux -Fu../rtl/linux -di386 -dGDB -Sg pp.pas
\end{verbatim}
So, the minimum options are:
\begin{enumerate}
\item The target OS. Can be skipped if you're compiling for the same target
as the compiler you're using.
\item A path to an RTL. Can be skipped if a correct ppc386.cfg configuration
is on your system. If you want to compile with the RTL you compiled first,
this should be \file{../rtl/OS} (replace the OS with the appropriate
operating system subdirectory of the RTL).
\item A define with the processor you're compiling for. Required.
\item \var{-dGDB} is not strictly needed, but is better to add since
otherwise you won't be able to compile with debug information.
\item \var{-Sg} is needed, because some parts of the compiler use \var{goto}
statements (to be specific: the scanner).
\end{enumerate}
So the absolute minimal command line is
\begin{verbatim}
ppc386 -di386 -Sg pp.pas
\end{verbatim}

You can define some other command-line options, but the above are the
minimum. A list of recognised defines can be found in \seet{FPCdefines}.

\begin{FPCltable}{ll}{Possible defines when compiling FPC}{FPCdefines}
Define & does what \\ \hline
USE\_RHIDE & Generates errors and warnings in a format recognized\\
& by \file{RHIDE}.
\\
TP & Needed to compile the compiler with Turbo or Borland Pascal. \\
Delphi & Needed to compile the compiler with Delphi from Borland. \\
GDB & Support of the GNU Debugger. \\
I386 & Generate a compiler for the Intel i386+ processor family. \\
M68K & Generate a compiler for the M68000 processor family. \\
USEOVERLAY & Compiles a TP version which uses overlays. \\
EXTDEBUG & Some extra debug code is executed. \\
SUPPORT\_MMX & only i386: enables the compiler switch \var{MMX} which \\
& allows the compiler to generate MMX instructions.\\
EXTERN\_MSG & Don't compile the msgfiles in the compiler, always use \\
& external messagefiles (default for TP).\\
NOAG386INT & no Intel Assembler output.\\
NOAG386NSM & no NASM output.\\
NOAG386BIN & leaves out the binary writer.\\ \hline
\end{FPCltable}
This list may be subject to change; the source file \file{pp.pas} always
contains an up-to-date list.
\end{document}