XML utility routines with support for UTF-8 encoding.

laz2_xmlutils.pas contains XML utility routines with support for the UTF-8 encoding. It is copied from code in the FreePascal Free Component Library (FCL) and modified to provide UTF-8 support.

laz2_xmlutils.pas is part of the LazUtils package.

TXMLUtilString is an alias to the AnsiString type.

TXMLUtilChar is an alias to the Char type.

PXMLUtilChar is an alias to the PChar type.

PXMLUtilString is a pointer to the TXMLUtilString type.

Determines if the specified value contains a valid XML element name.

IsXmlName is an overloaded Boolean function used to determine if Value contains a valid XML element name. It ensures that the name is valid using the Name production in the XML specifications.

In XML 1.0, the characters allowed in a name were restricted. The specification followed the principle of "that which is not expressly permitted is prohibited."

The Xml11 flag indicates whether the XML 1.1 element naming conventions are allowed. When set to True, the XML 1.1 naming conventions are used. XML 1.1 allows almost all Unicode characters in any position in the name except the first, which is restricted to the NameStart characters. When set to False (the default value), the XML 1.0 naming conventions are used.

Extensible Markup Language (XML) 1.0 (Fifth Edition) Names and Tokens

Extensible Markup Language (XML) 1.1 (Second Edition) Names and Tokens

The overloaded variants allow Value to be specified as a TXMLUtilString (AnsiString) or a PXMLUtilChar (PChar) type.

The return value is True when Value is a valid XML element name for the indicated specification level.
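The check described above can be illustrated with a minimal sketch of the Name production from the XML 1.1 specification. Python is used here only as neutral notation; this is not the LazUtils implementation. The name is_xml11_name is hypothetical, and the sketch works on decoded code points, whereas the Pascal routine receives UTF-8 encoded values.

```python
# Illustrative sketch of the XML 1.1 Name production; not the LazUtils code.

# NameStartChar ranges from the XML 1.1 specification (inclusive code points).
NAME_START_RANGES = [
    (0x3A, 0x3A),        # ":"
    (0x41, 0x5A),        # A-Z
    (0x5F, 0x5F),        # "_"
    (0x61, 0x7A),        # a-z
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]

# Additional characters allowed after the first position (NameChar).
NAME_EXTRA_RANGES = [
    (0x2D, 0x2E),        # "-" and "."
    (0x30, 0x39),        # 0-9
    (0xB7, 0xB7),        # middle dot
    (0x300, 0x36F),      # combining marks
    (0x203F, 0x2040),    # undertie and character tie
]


def _in_ranges(ch, ranges):
    """Return True when the code point of ch falls in any inclusive range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


def is_xml11_name(value):
    """Hypothetical helper: True when value matches the XML 1.1 Name production."""
    if not value:
        return False
    # The first character must be a NameStartChar.
    if not _in_ranges(value[0], NAME_START_RANGES):
        return False
    # Every following character must be a NameChar.
    return all(
        _in_ranges(ch, NAME_START_RANGES) or _in_ranges(ch, NAME_EXTRA_RANGES)
        for ch in value[1:]
    )
```

Under these rules a value such as root or a-b.c1 is accepted, while 1abc and -a are rejected because a digit or hyphen is not permitted in the first position.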

True when the value is a valid XML element name.

Value examined in the routine.

True when XML 1.1 naming conventions are used in the document.

Number of characters in the value referenced by the pointer.

Determines if the specified value is a valid XML encoding name.

IsValidXmlEncoding is a Boolean function used to determine if the value specified in Value is a valid XML encoding name. The encoding name is used in the XML declaration for a document. For instance:

<?xml encoding='UTF-8'?>

or:

<?xml encoding='EUC-JP'?>
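As a rough illustration of this kind of validation, the sketch below tests a value against the EncName production of the XML specification: a Latin letter followed by Latin letters, digits, periods, underscores, or hyphens. It assumes the routine follows that production. The name is_valid_xml_encoding is hypothetical, and Python is used only as neutral notation.

```python
import re

# EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*   (from the XML specification)
_ENC_NAME = re.compile(r"[A-Za-z][A-Za-z0-9._-]*")


def is_valid_xml_encoding(value):
    """Hypothetical helper: True when value is a syntactically valid encoding name."""
    # fullmatch requires the whole string to match, so "" and "UTF 8" fail.
    return _ENC_NAME.fullmatch(value) is not None
```

Both UTF-8 and EUC-JP pass this check. A value that starts with a digit, such as 8859-1, does not. The check is purely syntactic and does not establish that the named encoding is actually supported.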

True when the value satisfies the EncName production in the XML specification.

Value examined for a valid encoding identifier.

Removes duplicate spaces in the specified value.

Value updated to remove duplicate spaces in the routine.

Determines if the specified value is an XML Whitespace character.

True when the value is an XML Whitespace character.

Value examined in the routine.

Beware: works in the ASCII character range only.

A simple hash table with TXMLUtilString keys.

Another hash table used to detect duplicate namespaced attributes without memory allocations.

Override the prefix only.

Override the prefix and emit a namespace definition.
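The whitespace test and the duplicate-space removal described earlier in this section can be sketched as follows. This is an illustration only, written in Python as neutral notation. The names is_xml_whitespace and normalize_spaces are hypothetical, and the trimming of leading and trailing spaces is an assumption that the text above does not confirm.

```python
# The four characters the XML specification treats as white space:
# space (#x20), tab (#x9), carriage return (#xD), and line feed (#xA).
XML_WHITESPACE = " \t\r\n"


def is_xml_whitespace(ch):
    """Hypothetical helper: True when ch is a single XML white space character."""
    return len(ch) == 1 and ch in XML_WHITESPACE


def normalize_spaces(value):
    """Hypothetical helper: collapse each run of spaces (U+0020) into one space.

    Assumption: leading and trailing spaces are dropped as well.
    Other white space characters, such as tabs, are left untouched.
    """
    # Splitting on a single space yields empty strings for every extra
    # space in a run; dropping them and re-joining removes the duplicates.
    return " ".join(part for part in value.split(" ") if part)
```

With this sketch, a value containing several consecutive spaces between two words comes back with a single space between them, while a value containing tabs is returned unchanged.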