+ Added Carls patches.

2025-08-14 08:09:18 +02:00 · 1998-07-22 13:51:21 +00:00 · 1998-07-22 13:51:21 +00:00 · 61285973f9
commit 61285973f9
parent 08bb4c9d4b
2 changed files with 685 additions and 361 deletions
--- a/docs/prog.tex
+++ b/docs/prog.tex
@ -31,8 +31,8 @@
 \begin{document}
 \title{Free Pascal \\ Programmers' manual}
 \docdescription{Programmers' manual for \fpc, version \fpcversion}
-\docversion{1.3}
+\docversion{1.4}
-\date{March 1998}
+\date{July 1998}
 \author{Micha\"el Van Canneyt}
 \maketitle
 \tableofcontents
@ -45,7 +45,7 @@
 This is the programmer's manual for \fpc.
 It describes some of the peculiarities of the \fpc compiler, and provides a
-glimp of how the compiler generates its code, and how you can change the
+glimpse of how the compiler generates its code, and how you can change the
 generated code. It will not, however, provide you with a detailed account of
 the inner workings of the compiler, nor will it tell you how to use the
 compiler (described in the \userref). It also will not describe the inner
@ -193,17 +193,17 @@ appears on your system.
 {\em Remark :} Take care that the object file you're linking is in a
 format the linker understands. Which format this is, depends on the platform
-you're on. Typing \var{ld} on th command line gives a list of formats
+you're on. Typing \var{ld} on the command line gives a list of formats
 \var{ld} knows about.
 You can pass other files and options to the linker using the \var{-k}
 command-line option. You can specify more than one of these options, and
-they
+they will be passed to the linker, in the order that you specified them on
-will be passed to the linker, in the order that you specified them on the
+the command line, just before the names of the object files that must be
-command line, just before the names of the object files that must be linked.
+linked.
 % Assembler type
-\subsection{\var{\$I386\_XXX} : Specify assembler format}
+\subsection{\var{\$I386\_XXX} : Specify assembler format (Intel x86 only)}
 This switch informs the compiler what kind of assembler it can expect in an
 \var{asm} block. The \var{XXX} should be replaced by one of the following:
 \begin{description}
@ -218,7 +218,7 @@ is compiled, unless they are replaced by another directive of the same type.
 The command-line switch that corresponds to this switch is \var{-R}.
-\subsection{\var{\$MMX} : MMX support}
+\subsection{\var{\$MMX} : MMX support (Intel x86 only)}
 As of version 0.9.8, \fpc supports optimization for the \textbf{MMX} Intel
 processor (see also \ref{ch:MMXSupport}). This optimizes certain code parts for the \textbf{MMX} Intel
 processor, thus greatly improving speed. The speed is noticed mostly when
@ -267,7 +267,7 @@ generated. You can specify this switch \textbf{only} befor the \var{Program}
 or \var{Unit} clause in your source file. The different kinds of formats are
 shown in \seet{Formats}.
-\begin{FPCltable}{ll}{Formats generated by the compiler}{Formats} \hline
+\begin{FPCltable}{ll}{Formats generated by the x86 compiler}{Formats} \hline
 Switch value & Generated format \\ \hline
 att  & AT\&T assembler file. \\
 o    & Unix object file.\\
@ -315,23 +315,42 @@ the executable. The effect of this switch is the same as the command-line
 switch \var{-g}. By default, insertion of debugging information is off.
 \subsection{\var{\$E} : Emulation of coprocessor}
 This directive controls the emulation of the coprocessor. On the i386
 processor, it is supported for
 compatibility with Turbo Pascal. The compiler itself doesn't do the emulation
 of the coprocessor. Under \dos, the \dos extender does this, and under
 \linux, the kernel takes care of the coprocessor support.
-If you use the Motorola 680x0 version, then the switch is recognized, as
+This directive controls the emulation of the coprocessor. There is no
-there is no extender to emulate the coprocessor, so the compiler must do
+command-line counterpart for this directive.
-that by itself.
+
 \subsubsection{ Intel x86 version }
 When this switch is enabled, all floating point instructions
 which are not supported by standard coprocessor emulators will give out
 a warning.
 The compiler itself doesn't do the emulation of the coprocessor.
 To use coprocessor emulation under \dos go32v1 there is nothing special
 required, as it is handled automatically.
 To use coprocessor emulation under \dos go32v2 you must use the
 emu387 unit, which contains correct initialization code for the
 emulator.
 Under \linux, the kernel takes care of the coprocessor support.
 \subsubsection{ Motorola 680x0 version }
 When the switch is on, no floating point opcodes are emitted
 by the code generator. Instead, internal run-time library routines
 are called to do the necessary calculations. In this case all
 real types are mapped to the single IEEE floating point type.
 \emph{ Remark : } By default, emulation is on. It is possible to
 intermix emulation code with real floating point opcodes, as
 long as the only type used is single or real.
 There is no command-line counterpart for this directive.
 \subsection{\var{\$G} : Generate 80286 code}
-This option is recognised for Turbo Pascal cmpatibility, but is ignored,
+This option is recognised for Turbo Pascal compatibility, but is ignored,
 because the compiler needs at least a 386 or higher class processor.
 \subsection{\var{\$L} : Local symbol information}
@ -348,15 +367,20 @@ mathematics.
 \subsection{\var{\$O} : Overlay code generation }
 This switch is recognised for Turbo Pascal compatibility, but is otherwise
-ignored, since the compiler requires a 386 or higher computer, with at 
+ignored.
 least 4 Mb. of ram.
 \subsection{\var{\$Q} : Overflow checking}
 The \var{\{\$Q+\}} directive turns on integer overflow checking.
 This means that the compiler inserts code to check for overflow when doing
 computations with an integer.
 When an overflow occurs, the run-time library will print a message
-\var{Overflow at xxx}, and exit the program with exit code 1.
+\var{Overflow at xxx}, and exit the program with exit code 215.
 \emph{ Remark: } Overflow checking behaviour is not the same as in
 Turbo Pascal since all arithmetic operations are done via 32-bit
 values. Furthermore, the Inc() and Dec() standard system procedures
 \emph{ are } checked for overflow in \fpc, while in Turbo Pascal they
 are not.
 Using the \var{\{\$Q-\}} switch switches off the overflow checking code
 generation.
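 As a minimal sketch of the behaviour described above (the program and
 variable names are made up for illustration), the following program
 increments a \var{Longint} beyond its largest value while overflow checking
 is switched on. Going by the rules stated in this section, the \var{Inc}
 call is expected to be reported as an overflow at run time, so the final
 \var{Writeln} should not be reached:
 \begin{verbatim}
 {$Q+}
 program OverflowDemo;
 var
   L: Longint;
 begin
   L := High(Longint);   { largest value a Longint can hold }
   Inc(L);               { checked for overflow when $Q+ is active }
   Writeln(L);           { not expected to be reached }
 end.
 \end{verbatim}
 Compiling the same program with \var{\{\$Q-\}} omits the check.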
@ -370,27 +394,26 @@ indices, enumeration types, subrange types, etc. Specifying the
 \var{\{\$R+\}} switch tells the computer to generate code to check these
 indices. If, at run-time, an index or enumeration type is specified that is
 out of the declared range of the compiler, then a run-time error is
-generated, and the program exits with exit code 1.
+generated, and the program exits with exit code 201.
 The \var{\{\$R-\}} switch tells the compiler not to generate range checking
 code. This may result in faulty program behaviour, but no run-time errors
 will be generated.
-{\em Remark: } this has not been implemented completely yet.
+{\em Remark: } Range checking for sets and enumerations is not yet fully
 implemented.
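 The following sketch illustrates the kind of error that range checking is
 meant to catch. The program, array and variable names are hypothetical. The
 index is computed at run time and falls outside the declared bounds of the
 array, so with \var{\{\$R+\}} the assignment is expected to stop the program
 with a run-time error instead of silently overwriting memory:
 \begin{verbatim}
 {$R+}
 program RangeDemo;
 var
   A: array[1..10] of Longint;
   I: Longint;
 begin
   I := 11;      { one past the upper bound of A }
   A[I] := 0;    { index outside 1..10: caught when $R+ is active }
 end.
 \end{verbatim}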
 \subsection{\var{\$S} : Stack checking}
 The \var{\{\$S+\}} directive tells the compiler to generate stack checking
 code. This generates code to check if a stack overflow occurred, i.e. to see
 whether the stack has grown beyond its maximally allowed size. If the stack
 grows beyond the maximum size, then a run-time error is generated, and the
-program will exit with exit code 1.
+program will exit with exit code 202.
 Specifying \var{\{\$S-\}} will turn generation of stack-checking code off.
-There is no command-line switch which is equivalent to this directive.
+The command-line compiler switch \var{-Ct} has the same effect as the
-
+\var{\{\$S+\}} directive.
 {\em Remark: } In principle, the stack is almost unlimited, 
 i.e. limited to the total free amount of memory on the computer.
 \subsection{\var{\$X} : Extended syntax}
@ -410,10 +433,10 @@ end;
 {$X-}
 Func (A);
 \end{verbatim}
-The reason this construct is supported is that
+The reason this construct is supported is that you may wish to call a
-you may wish to call a function for certain side-effects it has, but you
+function for certain side-effects it has, but you don't need the function
-don't need the function result. In this case you don't need to assign the
+result. In this case you don't need to assign the function result, saving
-function result, saving you an extra variable.
+you an extra variable.
 The command-line compiler switch \var{-Sa1} has the same effect as the
 \var{\{\$X+\}} directive.
@ -500,7 +523,7 @@ you should change \var{v} with the version number of the compiler
 you're using, \var{r} with the release number and \var{p}
 with the patch-number of the compiler. 'OS' needs to be changed by the type
 of operating system. Currently this can be one of \var{DOS}, \var{GO32V2},
-\var{LINUX}, \var{OS2} or \var{WIN32}. This symbol is undefined if you
+\var{LINUX}, \var{OS2}, \var{WIN32}, \var{MACOS}, \var{AMIGA} or \var{ATARI}. This symbol is undefined if you
 specify a target that is different from the platform you're compiling on.
 The \var{-TSomeOS} option on the command line will define the \var{SomeOS} symbol,
 and will undefine the existing platform symbol\footnote{In versions prior to
@ -809,7 +832,7 @@ need to compile with the \var{-Sm} command-line switch.
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % Using assembly language
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\chapter{Using assembly language}
+\chapter{Using Assembly language}
 \label{ch:AsmLang}
 \fpc supports inserting of assembler instructions in your code. The
 mechanism for this is the same as under Turbo Pascal. There are, however
@ -817,7 +840,7 @@ some substantial differences, as will be explained in the following.
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % Intel syntax
-\section{Intel syntax}
+\section{Intel syntax (Intel x86 only) }
 \label{se:Intel}
 As of version 0.9.7, \fpc supports Intel syntax in its \var{asm} blocks.
@ -956,7 +979,7 @@ The Intel inline assembler supports the following macros :
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % AT&T syntax
-\section{AT\&T Syntax}
+\section{AT\&T Syntax (Intel x86 only) }
 \label{se:AttSyntax}
 \fpc uses the \gnu \var{as} assembler to generate its object files. Since
 the \gnu assembler uses AT\&T assembly syntax, the code you write should
@ -1057,57 +1080,33 @@ they are pushed {\em right} to {\em left}, instead of left to right for
 Turbo Pascal. This is especially important if you have some assembly
 subroutines in Turbo Pascal which you would like to translate to \fpc.
-Function results are returned in the first register, if they fit in the
+Function results are returned in the accumulator, if they fit in the
-register. For more information on this, see \sees{Stack}
+register.
 The registers are {\em not} saved when calling a function or procedure. If
 you want to call a procedure or function from assembly language, you must
 save any registers you wish to preserve.
 The first thing a procedure does is saving the base pointer, and setting the
-base (\var{\%ebp}) pointer equal to the stack pointer (\var{\%esp}). 
+base pointer equal to the stack pointer. References to the pushed parameters
-References to the pushed parameters and local variables are constructed 
+and local variables are constructed using the base pointer.
 using the base pointer.
-In practice this amounts to the following assembly code as the procedure or
+When the procedure or function exits, it clears the stack.
 function header :
 \begin{verbatim}
   pushl   %ebp
   movl    %esp,%ebp
 \end{verbatim}  
 When the procedure or function exits, it clears the stack by means of the
 \var{RET xx} call, where \var{xx} is the total size of the pushed parameters
 on the stack. Thus, in case parameters with a total size of \var{xx} have
 been passed to a function, the generated exit sequence looks as follows:
 \begin{verbatim}
  leave
  ret  $xx
 \end{verbatim}
 When you want your code to be called by a C library or used in a C
 program, you will run into trouble because of this calling mechanism. In C,
 the calling procedure is expected to clear the stack, not the called
-procedure. To avoid this problem, \fpc supports the \var{export} modifier.
+procedure. In other words, the arguments still are on the stack when the
-Procedures that are defined using the export modifier, use a C-compatible
+procedure exits. To avoid this problem, \fpc supports the \var{export}
-calling mechanism. This means that they can be called from a C program or
+modifier. Procedures that are defined using the export modifier, use a
-library, or that you can use them as a callback function.
+C-compatible calling mechanism. This means that they can be called from a
 C program or library, or that you can use them as a callback function.
 This also means that you cannot call this procedure or function from your
 own program, since your program uses the Pascal calling convention.
 However, in the exported function, you can of course call other Pascal
 routines.
 Technically, the C calling mechanism is implemented by generating the
 following exit sequence at the end of your function or procedure:
 \begin{verbatim}
  leave         {Copies EBP to ESP, pops EBP from the stack.}
  ret
 \end{verbatim}
 Comparing this exit sequence with the previous one makes it clear why you
 cannot call this procedure from within Pascal: The arguments still are on
 the stack when the procedure exits.
 As of version 0.9.8, the \fpc compiler supports also the \var{cdecl} and
 \var{stdcall} modifiers, as found in Delphi. The \var{cdecl} modifier does
 the same as the \var{export} modifier, and \var{stdcall} does nothing, since
@ -1136,6 +1135,54 @@ popstack & Right-to-left & Caller  & No \\ \hline
 More about this can be found in \seec{Linking} on linking.
 \subsection{ Intel x86 calling conventions }
 Standard entry code for procedures and functions is as follows on the
 x86 architecture:
 \begin{verbatim}
   pushl   %ebp
   movl    %esp,%ebp
 \end{verbatim}
 The generated exit sequence for procedure and functions looks as follows:
 \begin{verbatim}
  leave
  ret  $xx
 \end{verbatim}
 Where \var{xx} is the total size of the pushed parameters.
 For more information on function return values, see
 \sees{RegConvs}.
 \subsection{ Motorola 680x0 calling conventions }
 Standard entry code for procedures and functions is as follows on the
 680x0 architecture:
 \begin{verbatim}
   move.l  a6,-(sp)
   move.l  sp,a6
 \end{verbatim}
 The generated exit sequence for procedure and functions looks as follows:
 \begin{verbatim}
  unlk   a6
  move.l (sp)+,a0     ; Get return address
  add.l  #xx,sp       ; Remove allocated stack
  move.l a0,-(sp)     ; Put back return address on top of the stack
 \end{verbatim}
 Where \var{xx} is the total size of the pushed parameters.
 For more information on function return values, see
 \sees{RegConvs}.
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % Telling the compiler what registers have changed
 \section{Telling the compiler what registers have changed}
@ -1153,9 +1200,8 @@ asm
  ...
 end ['R1',...,'Rn'];
 \end{verbatim}
-Here \var{R1} to \var{Rn} are the names of the (extended) registers you 
+Here \var{R1} to \var{Rn} are the names of the 32-bit registers you
-modify in your assembly code. They can be one of \var{'EAX', 'EBX', 'ECX',
+modify in your assembly code.
 'EDX', 'EDI', 'ESI'} for the Intel processor.
 As an example :
 \begin{verbatim}
@ -1167,6 +1213,27 @@ As an example :
 \end{verbatim}
 This example tells the compiler that the \var{EAX} register was modified.
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % Register conventions
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \section{Register Conventions}
 \label{se:RegConvs}
 The compiler has different register conventions, depending on the
 target processor used.
 \subsection{ Intel x86 version }
 When optimizations are on, no register can be freely modified, without
 first being saved and then restored. Otherwise, EDI is usually used as
 a scratch register and can be freely used in assembler blocks.
 \subsection{ Motorola 680x0 version }
 Registers which can be freely modified without saving are registers
 D0, D1, D6, A0, A1, and floating point registers FP2 to FP7. All other
 registers are to be considered reserved and should be saved and then
 restored when used in assembler blocks.
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % Linking issues
@ -1542,14 +1609,14 @@ instructions on how to use and declare objects, see \refref.
 When using objects that need virtual methods, the compiler uses two help
 procedures that are in the run-time library. They are called
 \var{Help\_Destructor} and \var{Help\_Constructor}, and they are written in
-assebly language. They are used to allocate the necessary memory if needed,
+assembly language. They are used to allocate the necessary memory if needed,
 and to insert the Virtual Method Table (VMT) pointer in the newly allocated
 object.
 When the compiler encounters a call to an object's constructor,
 it sets up the stack frame for the call, and inserts a call to the
 \var{Help\_Constructor}
-procedure before issuing the call to the real constuctor. 
+procedure before issuing the call to the real constructor.
 The helper procedure allocates the needed memory (if needed) and inserts the
 VMT pointer in the object. After that, the real constructor is called.
@ -1690,7 +1757,7 @@ set up the stack. Then it calls the main program.
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % MMX Support
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\chapter{MMX support}
+\chapter{MMX support (Intel x86 only) }
 \label{ch:MMXSupport}
 \section{What is it about ?}
@ -1846,7 +1913,7 @@ procedure.
 \label{se:ThirtytwoBit}
 The \fpc Pascal compiler issues 32-bit code. This has several consequences:
 \begin{itemize}
-\item You need a i386 or higher processor to run the generated code. The
+\item You need a 386 processor to run the generated code. The
 compiler functions on a 286 when you compile it using Turbo Pascal,
 but the generated programs cannot be assembled or executed.
 \item You don't need to bother with segment selectors. Memory can be
@ -1909,8 +1976,8 @@ pointer to \var{self} is pushed on the stack.
 \item If the procedure or function is nested in another function or
 procedure, then the frame pointer of the parent procedure is pushed on the
 stack.
-\item The return address is pushed on the stack (by the \var{Call}
+\item The return address is pushed on the stack (This is done automatically
-instruction).
+by the instruction which calls the subroutine).
 \end{enumerate}
 The resulting stack frame upon entering looks as in \seet{StackFrame}.
@ -1924,19 +1991,71 @@ Offset & What is stored & Optional ? \\ \hline
 +0 & Return address & No\\ \hline
 \end{FPCltable}
 \subsection{ Intel x86 version }
 The stack is cleared with the \var{ret} I386 instruction, meaning that the
 size of all pushed parameters is limited to 64K.
-The stack size is unlimited for all supported platforms. On the \var{GO32V2}
+\subsubsection{ DOS }
 platform, the minimum guaranteed stack is 128Kb, but this can be set with
 the \var{-Ctxxx} compiler switch. 
 Under the DOS targets , the default stack is set to 256Kb. This value
 cannot be modified for the GO32V1 target. But this can be modified
 with the GO32V2 target using a special DJGPP utility \var{stubedit}.
 Note that the stack size may also be changed with some compiler
 switches. If the requested stack size is \emph{greater} than the default
 stack size, it is used; otherwise the default stack size is used.
 \subsubsection{ Linux }
 Under Linux, stack size is only limited by the available memory by
 the system.
 \subsubsection{ OS/2 }
 Under OS/2, stack size is determined by one of the runtime
 environment variables set for EMX. Therefore, the stack size
 is user defined.
 \subsection{ Motorola 680x0 version }
 Depending on the target processor, the stack can be cleared in two
 ways. If the target processor is a MC68020 or higher, the stack is
 cleared with a single \var{rtd} instruction, meaning that the size
 of all pushed parameters is limited to 32K.
 On MC68000/68010 processors, the stack clearing mechanism
 is slightly more complicated; the exit code looks like this:
 \begin{verbatim}
 {
  move.l  (sp)+,a0
  add.l   paramsize,a0
  move.l  a0,-(sp)
  rts
 }
 \end{verbatim}
 \subsubsection{ Amiga }
 Under AmigaOS, stack size is determined by the user, who sets this
 value using the stack program. Typical sizes range from 4K to 40K.
 \subsubsection{ Atari }
 Under Atari TOS, stack size is currently limited to 8K, and it cannot
 be modified. This may change in a future release of the compiler.
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % The heap
 \section{The heap}
 \label{se:Heap}
 The heap is used to store all dynamic variables, and to store class
 instances. The interface to the heap is the same as in Turbo Pascal,
 although the effects are maybe not the same. On top of that, the \fpc
 run-time library has some extra possibilities, not available in Turbo
 Pascal. These extra possibilities are explained in the next subsections.
 % The heap grows
 \subsection{The heap grows}
 \fpc supports the \var{HeapError} procedural variable. If this variable is
@ -2064,7 +2183,7 @@ ReleaseTempHeap;
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % Accessing DOS memory under the GO32 extender
-\section{Accessing \dos memory under the Go32 extender}
+\section{Accessing \dos memory under the Go32 extender (Intel x86 only) }
 \label{se:AccessingDosMemory}
 Because \fpc is a 32 bit compiler, and uses a \dos extender, accessing DOS
@ -2118,6 +2237,317 @@ After using the selector, you must free it again using the
 More information on all this can be found in the \unitsref, the chapter on
 the \file{GO32} unit.
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % Optimizations done in the compiler
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \chapter{Optimizations}
 \section{ Non processor specific }
 The following sections describe the general optimizations
 done by the compiler; they are not processor specific. Some
 of these require a compiler switch, while others are done
 automatically (those which require a switch are noted as such).
 \subsection{ Constant folding }
 In \fpc, if the operand(s) of an operator are constants, they
 will be evaluated at compile time.
 Example
 \begin{verbatim}
   x:=1+2+3+6+5;
 will generate the same code as
   x:=17;
 \end{verbatim}
 Furthermore, if an array index is a constant, the offset will
 be evaluated at compile time. This means that accessing MyData[5]
 is as efficient as accessing a normal variable.
 Finally, calling \var{Chr}, \var{Hi}, \var{Lo}, \var{Ord}, \var{Pred},
 or \var{Succ} functions with constant parameters generates no
 run-time library calls, instead, the values are evaluated at
 compile time.
 \subsection{ Constant merging }
 Using the same constant string two or more times generates only
 one copy of the string constant.
 \subsection{ Short cut evaluation }
 Evaluation of a boolean expression stops as soon as the result is
 known, which makes code execute faster than if all boolean operands
 were evaluated.
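 A common use of this property is guarding a pointer dereference. The
 following is a small sketch with made-up type and variable names. Because
 the first operand of \var{and} is already false, the second operand is not
 evaluated and the \var{nil} pointer is never dereferenced:
 \begin{verbatim}
 program ShortCutDemo;
 type
   PNode = ^TNode;
   TNode = record
     Value: Longint;
   end;
 var
   P: PNode;
 begin
   P := nil;
   { P <> nil is false, so P^.Value is not evaluated }
   if (P <> nil) and (P^.Value > 0) then
     Writeln('positive')
   else
     Writeln('nil or not positive');
 end.
 \end{verbatim}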
 \subsection{ Constant set inlining }
 Using the \var{in} operator is always more efficient than using the
 equivalent <>, =, <=, >=, < and > operators. This is because
 range comparisons can be done more easily with \var{in} than with
 normal comparison operators.
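 To make the comparison concrete, the sketch below writes the same range
 test in both forms. The program and variable names are chosen for
 illustration only; both tests give the same result, and the first form is
 the one this section recommends:
 \begin{verbatim}
 program InDemo;
 var
   C: Char;
 begin
   C := '7';
   { Range test written with the in operator }
   if C in ['0'..'9'] then
     Writeln('digit (in)');
   { The same test written with comparison operators }
   if (C >= '0') and (C <= '9') then
     Writeln('digit (comparisons)');
 end.
 \end{verbatim}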
 \subsection{ Small sets }
 Sets which contain fewer than 33 elements can be directly encoded
 using a 32-bit value, therefore no run-time library calls to
 evaluate operands on these sets are required; they are directly encoded
 by the code generator.
 \subsection{ Range checking }
 Assignments of constants to variables are range checked at compile
 time, which removes the need for the generation of runtime range checking
 code.
 \emph{Remark:} This feature was not implemented before version
 0.99.5 of \fpc.
 \subsection{ Shifts instead of multiply or divide }
 When one of the operands in a multiplication is a power of
 two, the multiplication is encoded using arithmetic shift instructions,
 which generates more efficient code.
 Similarly, if the divisor in a \var{div} operation is a power
 of two, it is encoded using arithmetic shift instructions.
 The same is true when accessing array indexes which are
 powers of two: the address is calculated using arithmetic
 shifts instead of the multiply instruction.
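 The equivalence that the compiler relies on for multiplication can be
 shown with a short sketch (hypothetical program and variable names). Both
 statements compute the same value; the second one merely spells out the
 shift by hand:
 \begin{verbatim}
 program ShiftDemo;
 var
   X, Y: Longint;
 begin
   Y := 5;
   X := Y * 8;      { 8 is 2 to the power 3, so a shift by 3 can be used }
   Writeln(X);      { 40 }
   X := Y shl 3;    { the same computation written as an explicit shift }
   Writeln(X);      { 40 }
 end.
 \end{verbatim}
 There is normally no need to write the shift yourself, since the
 multiplication form is easier to read.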
 \subsection{ Automatic alignment }
 By default all variables larger than a byte are guaranteed to be aligned
 at least on a word boundary.
 Furthermore all pointers allocated using the standard runtime
 library (\var{New} and \var{GetMem} among others) are guaranteed
 to return pointers aligned on a quadword boundary (64-bit alignment).
 Alignment of variables on the stack depends on the target processor.
 \emph{ Remark: } Quadword alignment of pointers is not guaranteed
 on systems which don't use an internal heap, such as for the Win32
 target.
 \emph{ Remark: } Alignment is also done \emph{between} fields in
 records, objects and classes. This is \emph{not} the same as
 in Turbo Pascal and may cause problems when using disk I/O with these
 types. To get no alignment between fields use the \var{packed} directive
 or the \var{\{\$PackRecords n\}} switch. For further information, take a
 look at the reference manual under the \var{record} heading.
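 The effect of field alignment can be observed with \var{SizeOf}. The
 sketch below uses two made-up record types with identical fields. The size
 of the unpacked record depends on the alignment the compiler applies, so no
 value is assumed for it here; the packed record occupies exactly the sum of
 its field sizes:
 \begin{verbatim}
 program AlignDemo;
 type
   TPlain = record
     B: Byte;
     L: Longint;
   end;
   TPacked = packed record
     B: Byte;
     L: Longint;
   end;
 begin
   { May be larger than 5 because of padding between B and L }
   Writeln(SizeOf(TPlain));
   { 1 byte + 4 bytes, no padding }
   Writeln(SizeOf(TPacked));   { 5 }
 end.
 \end{verbatim}
 When records are written to disk and read back by Turbo Pascal programs,
 the packed form is the safer choice.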
 \subsection{ Smart linking }
 This feature removes all unreferenced code in the final executable
 file, making the executable file much smaller.
 \emph{ Remark: } Smart linking was implemented starting with
 version 0.99.6 of \fpc.
 \subsection{ Inline routines }
 The following runtime library routines are coded directly into the
 final executable : \var{Lo}, \var{Hi}, \var{High}, \var{Sizeof},
 \var{TypeOf}, \var{Length}, \var{Pred}, \var{Succ}, \var{Inc},
 \var{Dec} and \var{Assigned}.
 \emph{ Remark: } Inline \var{Inc} and \var{Dec} were not completely
 implemented until version 0.99.6 of \fpc.
 \subsection{ Case optimization }
 When using the \var{-Oa} switch, some case statements will
 be implemented using a jump table, which can make the
 case statement execute faster.
 \subsection{ Stack frame omission }
 When using the \var{-Ox} switch, under certain specific conditions,
 the stack frame (entry and exit code for the routine) will be omitted, and
 variables will be accessed directly via the stack pointer.
 Conditions for omission of the stack frame :
 \begin{itemize}
 \item Routine does not call other routines
 \item Routine does not contain assembler statements
 \item Routine is not declared using the \var{Interrupt} directive
 \item Routine is not a constructor or destructor
 \end{itemize}
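 As an illustration of these conditions, the function in the following
 sketch (the names are hypothetical) calls no other routine, contains no
 assembler statements, is not an interrupt routine and is not a constructor
 or destructor. It is therefore the kind of routine that may qualify for
 stack frame omission; whether the frame is actually omitted is decided by
 the compiler:
 \begin{verbatim}
 program LeafDemo;
 { A leaf routine that satisfies the conditions listed above }
 function Sum3(A, B, C: Longint): Longint;
 begin
   Sum3 := A + B + C;
 end;
 begin
   { The main program may call other routines; only Sum3 is a candidate }
   Writeln(Sum3(1, 2, 3));   { 6 }
 end.
 \end{verbatim}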
 \subsection{ Register variables }
 When using the \var{-Ox} switch, local variables or parameters
 which are used very often will be moved to registers for faster
 access.
 \emph{ Remark: } Register variable allocation is currently
 broken and should not be used.
 \subsection{ Intel x86 specific }
 Here follows a listing of the optimizing techniques used in the compiler:
 \begin{enumerate}
 \item When optimizing for a specific processor (\var{-O3, -O4, -O5, -O6}),
 the following is done:
 \begin{itemize}
 \item In \var{case} statements, a check is done whether a jump table
 or a sequence of conditional jumps should be used for optimal performance.
 \item Determines a number of strategies when doing peephole optimization:
 \var{movzbl (\%ebp), \%eax} on PentiumPro and PII systems will be changed
 into \var{xorl \%eax,\%eax; movb (\%ebp),\%al } for lesser systems.
 \end{itemize}
 Cyrix \var{6x86} processor owners should optimize with \var{-O4} instead of
 \var{-O5}, because \var{-O5} leads to larger code, and thus to lower
 speed, according to the Cyrix developers FAQ.
  \item When optimizing for speed (\var{-OG}) or size (\var{-Og}), a choice is
 made between using shorter instructions (for size) such as \var{enter \$4},
 or longer instructions \var{subl \$4,\%esp} for speed. When smaller size is
 requested, things aren't aligned on 4-byte boundaries.  When speed is
 requested, things are aligned on 4-byte boundaries as much as possible.
 \item Simple optimization (\var{-Oa}) makes sure the peephole optimizer is
 used, as well as the reloading optimizer.
 \item Uncertain optimizations (\var{-Oz}): With this switch, the reloading
 optimizer (enabled with \var{-Oa}) can be forced into making uncertain
 optimizations.
 You can enable uncertain optimizations only in certain cases,
 otherwise you will produce a bug; the following technical description
 tells you when to use them:
 \begin{quote}
 % Jonas's own words..
 \em
 If uncertain optimizations are enabled, the reloading optimizer assumes
 that
 \begin{itemize}
 \item If something is written to a local/global register or a
 procedure/function parameter, this value doesn't overwrite the value to
 which a pointer points.
 \item If something is written to memory pointed to by a pointer variable,
 this value doesn't overwrite the value of a local/global variable or a
 procedure/function parameter.
 \end{itemize}
 % end of quote
 \end{quote}
 The practical upshot of this is that you cannot use the uncertain
 optimizations if you access any local or global variables through pointers. In
 theory, this includes \var{Var} parameters, but it is all right
 if you don't both read the variable once through its \var{Var} reference
 and then read it using its name.
 The following example will produce bad code when you switch on
 uncertain optimizations:
 \begin{verbatim}
 Var temp: Longint;
 Procedure Foo(Var Bar: Longint);
 Begin
  If (Bar = temp)
    Then
      Begin
        Inc(Bar);
        If (Bar <> temp) then Writeln('bug!')
      End
 End;
 Begin
  Foo(Temp);
 End.
 \end{verbatim}
 The reason it produces bad code is because you access the global variable
 \var{Temp} both through its name \var{Temp} and through a pointer, in this
 case using the \var{Bar} variable parameter, which is nothing but a pointer
 to \var{Temp} in the above code.
 On the other hand, you can use the uncertain optimizations if
 you access global/local variables or parameters through pointers,
 and {\em only} access them through this pointer\footnote{
 You can use multiple pointers to point to the same variable as well, that
 doesn't matter.}.
 For example:
 \begin{verbatim}
 Type TMyRec = Record
                a, b: Longint;
              End;
     PMyRec = ^TMyRec;
     TMyRecArray = Array [1..100000] of TMyRec;
     PMyRecArray = ^TMyRecArray;
 Var MyRecArrayPtr: PMyRecArray;
    MyRecPtr: PMyRec;
    Counter: Longint;
 Begin
  New(MyRecArrayPtr);
  For Counter := 1 to 100000 Do
    Begin
       MyRecPtr := @MyRecArrayPtr^[Counter];
       MyRecPtr^.a := Counter;
       MyRecPtr^.b := Counter div 2;
    End;
 End.
 \end{verbatim}
 Will produce correct code, because the global variable \var{MyRecArrayPtr}
 is not accessed directly, but through a pointer (\var{MyRecPtr} in this
 case).
 In conclusion, one could say that you can use uncertain optimizations {\em
 only} when you know what you're doing.
 \end{enumerate}
 \subsection{ Motorola 680x0 specific }
 Using the \var{-O2} switch does several optimizations in the
 code produced, the most notable being:
 \begin{itemize}
 \item Sign extension from byte to long will use \var{EXTB}
 \item Returning of functions will use \var{RTD}
 \item Range checking will generate no run-time calls
 \item Multiplication will use the long \var{MULS} instruction, no
 runtime library call will be generated
 \item Division will use the long \var{DIVS} instruction, no
 runtime library call will be generated
 \end{itemize}
 \section{ Floating point }
 This section contains processor-specific information on the floating
 point code generated by the compiler.
 \subsection{ Intel x86 specific }
 All normal floating point types map to their real type, including
 \var{comp} and \var{extended}.
 \subsection{ Motorola 680x0 specific }
 Early generations of the Motorola 680x0 processors did not have integrated
 floating point units. To work around this, all floating point
 operations are emulated (when the \var{\$E+} switch is on, which is the default)
 using the IEEE \var{Single} floating point type. In other words, when
 emulation is on, Real, Single, Double and Extended all map to the
 \var{single} floating point type.
 When the \var{\$E} switch is turned off, normal 68882/68881/68040
 floating point opcodes are emitted. The Real type still maps to
 \var{Single}, but the other types map to their true floating point
 types. Only basic FPU opcodes are used, which means that the generated
 code works correctly on 68040 processors.
 \emph{ Remark: } \var{Double} and \var{Extended} types in true floating
 point mode have not been extensively tested as of version 0.99.5.
 \emph{ Remark: } The \var{comp} data type is currently not supported.
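 As an illustration of the type mapping described above, the following
 sketch prints the storage size of each floating point type. It assumes that
 the \var{\$E} switch can be given as a source directive in the usual
 \var{\{\$E+\}} form and that \var{SizeOf} reflects the mapped type; the
 program name is arbitrary:
 \begin{verbatim}
 {$E+}
 Program FloatSizes;
 Var r: Real;
     s: Single;
     d: Double;
     e: Extended;
 Begin
  { Print the size in bytes of each floating point type }
  Writeln('Real     : ', SizeOf(r));
  Writeln('Single   : ', SizeOf(s));
  Writeln('Double   : ', SizeOf(d));
  Writeln('Extended : ', SizeOf(e));
 End.
 \end{verbatim}
 With emulation on, all four sizes would be expected to match that of
 \var{Single}; changing the directive to \var{\{\$E-\}} should make
 \var{Double} and \var{Extended} report their true sizes.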
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % Appendices
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@ -2281,130 +2711,6 @@ changed by changing the \var{bytearray1} type in \file{cobjects.pas}
 compiler. When using the 32-bit compiler, the limit is set to 1024. You can
 change this by redefining the \var{maxunits} constant in the
 \file{files.pas} compiler source file.
 \item Procedures or functions accept parameters with a total size up to 
 \var{\$ffff} bytes. This limit is due to the \var{RET} instruction of the I386
 processor. If the calls were made using the C convention this limit would
 disappear.
 \end{enumerate}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % Appendix D
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \chapter{Optimizing techniques used in the compiler.}
 Here follows a listing of the optimizing techniques used in the compiler:
 \begin{enumerate}
 \item When optimizing for a specific processor (\var{-O3, -O4, -O5, -O6}),
 the following is done:
 \begin{itemize}
 \item In \var{case} statements, a check is done whether a jump table
 or a sequence of conditional jumps should be used for optimal performance.
 \item A number of strategies are determined for peephole optimization:
 \var{movzbl (\%ebp), \%eax} on PentiumPro and PII systems will be changed
 into \var{xorl \%eax,\%eax; movb (\%ebp),\%al } for lesser systems.
 \end{itemize}
 Cyrix \var{6x86} processor owners should optimize with \var{-O4} instead of
 \var{-O5}, because \var{-O5} leads to larger code, and thus to lower
 speed, according to the Cyrix developers FAQ.
  \item When optimizing for speed (\var{-OG}) or size (\var{-Og}), a choice is
 made between using shorter instructions (for size) such as \var{enter \$4},
 or longer instructions \var{subl \$4,\%esp} for speed. When smaller size is
 requested, things aren't aligned on 4-byte boundaries.  When speed is
 requested, things are aligned on 4-byte boundaries as much as possible.
 \item Simple optimization (\var{-Oa}) makes sure the peephole optimizer is
 used, as well as the reloading optimizer.
 \item Maximum optimization (\var{-Ox}) avoids creation of stack frames if
 they aren't required, and unnecessary loading of registers is avoided as
 much as possible. (This is buggy at the moment, in version 0.99.0.)
 \item Uncertain optimizations (\var{-Oz}): With this switch, the reloading 
 optimizer (enabled with \var{-Oa}) can be forced into making uncertain 
 optimizations.
 You can enable uncertain optimizations only in certain cases,
 otherwise the compiler will generate incorrect code; the following technical description
 tells you when to use them:
 \begin{quote}
 % Jonas's own words..
 \em
 If uncertain optimizations are enabled, the reloading optimizer assumes 
 that
 \begin{itemize}
 \item If something is written to a local/global register or a 
 procedure/function parameter, this value doesn't overwrite the value to 
 which a pointer points.
 \item If something is written to memory pointed to by a pointer variable, 
 this value doesn't overwrite the value of a local/global variable or a 
 procedure/function parameter.
 \end{itemize}
 % end of quote
 \end{quote}
 The practical upshot of this is that you cannot use the uncertain
 optimizations if you access any local or global variables through pointers. In 
 theory, this includes \var{Var} parameters, but it is all right 
 if you don't both read the variable once through its \var{Var} reference 
 and then read it using its name.
 The following example will produce bad code when you switch on
 uncertain optimizations:
 \begin{verbatim}
 Var temp: Longint;
 Procedure Foo(Var Bar: Longint);
 Begin
  If (Bar = temp)
    Then
      Begin
        Inc(Bar);
        If (Bar <> temp) then Writeln('bug!')
      End
 End;
 Begin
  Foo(Temp);
 End.
 \end{verbatim}
 It produces bad code because you access the global variable
 \var{Temp} both through its name \var{Temp} and through a pointer, in this
 case using the \var{Bar} variable parameter, which is nothing but a pointer
 to \var{Temp} in the above code.
 On the other hand, you can use the uncertain optimizations if
 you access global/local variables or parameters through pointers, 
 and {\em only} access them through this pointer\footnote{
 You can use multiple pointers to point to the same variable as well, that 
 doesn't matter.}.
 For example:
 \begin{verbatim}
 Type TMyRec = Record
                a, b: Longint;
              End;
     PMyRec = ^TMyRec;
     TMyRecArray = Array [1..100000] of TMyRec;
     PMyRecArray = ^TMyRecArray;
 Var MyRecArrayPtr: PMyRecArray;
    MyRecPtr: PMyRec;
    Counter: Longint;
 Begin
  New(MyRecArrayPtr);
  For Counter := 1 to 100000 Do
    Begin
       MyRecPtr := @MyRecArrayPtr^[Counter];
       MyRecPtr^.a := Counter;
       MyRecPtr^.b := Counter div 2;
    End;
 End.
 \end{verbatim}
 This will produce correct code, because the global variable \var{MyRecArrayPtr}
 is not accessed directly, but through a pointer (\var{MyRecPtr} in this
 case). 
 In conclusion, one could say that you can use uncertain optimizations {\em
 only} when you know what you're doing.
 \end{enumerate} 
 \end{document}
--- a/docs/ref.tex
+++ b/docs/ref.tex
@ -114,8 +114,8 @@ percent sign (\var{\%}). Thus, \var{255} can be specified in binary notation
 as \var{\%11111111}.
 \subsection{Real types}
-\fpc uses the math coprocessor (or an emulation) for al its floating-point 
+\fpc uses the math coprocessor (or an emulation) for all its floating-point
-calculations. The Real native type for is processor dependant,
+calculations. The Real native type is processor dependent,
 but it is either Single or Double. Only the IEEE floating point types are
 supported, and these depend on the target processor and emulation options.
 The true Turbo Pascal compatible types are listed in
@ -812,21 +812,6 @@ command-line switch.
 {\em Remark:} These constructions are just for typing convenience, they
 don't generate different code.
 \fpc also supports typed assignments. This means that an assignment
 statement has a definite type, and hence can be assigned to another
 variable. The type of the assignment \var{a:=b} is the type of \var{a}
 (or, in this case, of \var{b}), and this can be assigned to another
 variable : \var{c:=a:=b;}.
 To summarize: the construct
 \begin{verbatim}
 a:=b:=c;
 \end{verbatim}
 results in both \var{a} and \var{b} being assigned the value of \var{c}, which
 may be an expression.
 For this construct to be allowed, it is necessary to specify the \var{-Sa4}
 switch on the command line.
 \subsection{The \var{Case} statement}
 \fpc supports the \var{case} statement. Its prototype is
 \begin{verbatim}
@ -968,7 +953,11 @@ Be aware of the fact that the boolean expressions \var{Expression1} and
 will be stopped at the point where the outcome is known with certainty)
 \subsection{The \var{With} statement}
-The with statement serves to access the elements of a record, without
+
 The with statement serves to access the elements of a record\footnote{
 The \var{with} statement does not work correctly when used with 
 objects or classes until version 0.99.6.}, without
 having to specify the name of the record. Given the declaration:
 \begin{verbatim}
 Type Passenger = Record
@ -1063,10 +1052,12 @@ are also declared with open arrays as parameters, {\em not} to functions or
 procedures which accept arrays of fixed length.
 \section{Using assembler in your code}
 \fpc supports the use of assembler in your code, but not inline
-assembler macros. Assembly functions (i.e. functions declared with the
+assembler macros. For more information on the processor
-\var{Assembler} keyword) are supported as of version 0.9.7. (see
+specific assembler syntax and its limitations, see the \progref.
-\progref for more information about this).
+
 \subsection{ Assembler statements }
 The following is an example of assembler inclusion in your code.
 \begin{verbatim}
@ -1090,6 +1081,34 @@ recognise it, and treat it as any other conditionals.
 \emph{ Remark: } Before version 0.99.1, \fpc did not support
 reference to variables by their names in the assembler parts of your code.
 \subsection{ Assembler procedures and functions }
 Assembler procedures and functions are declared using the
 \var{Assembler} directive. The \var{Assembler} keyword is supported
 as of version 0.9.7. This permits the code generator to make a number
 of code generation optimizations.
 The code generator does not generate any stack frame (entry and exit
 code for the routine) if it contains no local variables. In the case
 of functions, ordinal values must be returned in the accumulator. For
 floating point values, the return location depends on the target processor
 and emulation options.
 \emph{ Remark: } Before version 0.99.1, \fpc did not support
 reference to variables by their names in the assembler parts of your code.
 \emph{ Remark: } Currently, the \var{Assembler} directive does not have the
 same effect as in Turbo Pascal, so beware! In \fpc, parameters are
 treated normally, which is not the case in Turbo Pascal. Furthermore,
 the stack frame will be omitted if there are no local variables; in this
 case, if the assembly routine has any parameters, they will be referenced
 directly via the stack pointer. This is {\em not} like Turbo Pascal, where
 the stack frame is only omitted if there are no parameters {\em and} no
 local variables. Therefore, if your assembly routines modify the stack
 pointer, such as when pushing or popping values on the stack, the
 \var{Assembler} keyword should not be used. Instead, use a normal procedure
 with \var{Asm} blocks.
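 As a sketch of the recommended alternative, the routine below changes the
 stack pointer inside an \var{Asm} block of a normal procedure. It assumes an
 Intel x86 target and the AT\&T assembler syntax; the procedure and program
 names are hypothetical:
 \begin{verbatim}
 Program AsmBlockDemo;
 { A normal procedure (no Assembler directive): the compiler }
 { generates the entry and exit code for the routine itself. }
 Procedure PushPopDemo;
 Begin
  { The push changes the stack pointer; the pop restores it }
  Asm
    pushl %eax
    popl %eax
  End;
 End;
 Begin
  PushPopDemo;
 End.
 \end{verbatim}
 Every push is matched by a pop before the block ends, so the stack pointer
 has its original value again when the procedure returns.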
 \section{Modifiers}
 \fpc doesn't support all Turbo Pascal modifiers, but
 does support a number of additional modifiers. They are used mainly for assembler and
@ -1207,7 +1226,6 @@ function must be exactly the same.
 The \var{external} modifier has also an extended syntax:
 \begin{enumerate}
 \item
 \begin{verbatim}
 external 'lname';
 \end{verbatim}