Contains routines used for string manipulation

LazStringUtils.pas contains routines used for string manipulation. It is part of the LazUtils package.

File added in LCL version 2.0.X (revision 58631).

Represents comment styles available in CommentText

Auto-detected comment style
No comment
Comment surrounded by { } characters
Delphi inline comment using a // marker
TurboPascal comment using (* *) markers
C++ comment using /* */ markers
Perl comment using a # marker
HTML/XML comment using <!-- --> markers

Set type for TCommentType enumeration values

End-of-line character sequence

EndOfLine is a ShortString constant used to represent the end-of-line character sequence. The value is set to the LineEnding for the platform or operating system.

True when characters in S are in the range '0'..'9'

True when characters in S are in the range '0'..'9'
String with values examined in the routine

Gets the number of LineEnding sequences in the specified text

Number of LineEnding sequences in the text
Text examined in the routine
Number of characters in the last line

Converts CR and LF characters in a string to the specified line-ending sequence

ChangeLineEndings is a String function used to convert CR or LF characters in a string to the specified line-ending character sequence. ChangeLineEndings assumes that values in S are single-byte values and not multi-byte UTF-8 characters.

ChangeLineEndings iterates over the byte values in S, and replaces any occurrences of CR (#13), LF (#10), or CR+LF (#13#10) with the specified end-of-line character sequence. No actions are performed in the function when S is an empty string ('').
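The replacement rule above can be modeled in a few lines of Python. This is only a sketch of the described behavior, not the Pascal implementation; the name change_line_endings and its parameters are chosen for illustration.

```python
import re


def change_line_endings(s: str, new_line_ending: str) -> str:
    """Model of the behavior described for ChangeLineEndings (illustrative only)."""
    if s == "":
        return s  # nothing to do for an empty string
    # CR+LF is listed first so the pair is replaced once, not twice.
    return re.sub(r"\r\n|\r|\n", new_line_ending, s)
```

For example, change_line_endings("a\r\nb\rc", "\n") yields the three lines a, b and c separated by single LF characters.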

Value from S after conversion of line endings
Values examined in the routine
End-of-line character sequence applied to the return value

Normalizes end-of-line characters in a string to the value in LineEnding

Calls ChangeLineEndings to normalize line ending sequences to the value in LineEnding.

Value after line ending sequences have been replaced
String with values examined and updated in the routine

Converts CR or LF characters in a string to the specified delimiter character

Value after converting CR, LF to the delimiter character
String with values converted in the routine
Delimiter character used in place of CR, LF characters

Deprecated. Use LineBreaksToSystemLineBreaks instead.

Converts all Tab characters in a string to the specified number of space characters

Replaces all Tab characters (#9) in S with the specified number of space characters.

String value after converting Tab characters to spaces
String with values updated in the routine
Number of space characters to use for each Tab character
True when UTF-8 codepoints are used to convert individual characters

Converts a string into a comment using the specified comment style

CommentText is a String function used to convert the specified string into a comment using a given comment style.

CommentType is a TCommentType value which indicates the comment style applied to the value in S. See TCommentType for more information about the enumeration values and their meanings.

No actions are performed in the function when CommentType contains comtNone.

When CommentType contains comtDefault, the comtPascal comment style is applied to the value in S.

An internal procedure is used to apply the starting marker (and ending marker when needed) as well as a continuation character sequence for a multi-line comment.

CommentText is used in the implementation of the source code editor in the Lazarus IDE. An Exception can be raised if the comment length does not match the expected length for the comment style.

Raises an Exception with the message 'IDEProcs.CommentText ERROR: ' when an unexpected length is found for the commented value.

Comment with the specified style marker(s)
Value converted into a comment
Comment type applied to the string

Creates a regular expression from a filter expression used in IDE dialogs

Used in the implementation of the Clean Directory and Change Encoding dialogs in the Lazarus IDE.

Replaces control characters with Pascal-style character constants

Replaces control characters (#0..#31) with Pascal-style character constants using the #nnn notation.

String after replacing binary characters
Values examined and converted in the routine

Converts occurrences of special characters to spaces

Converts special characters (#0..#31, #127) to a Space character. Converts line breaks to a single Space character (#32). Trims leading and trailing Spaces. Calls UTF8FixBroken when FixUTF8 is True.
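The conversion steps listed above can be sketched in Python as follows. This models only the described behavior; the function name is illustrative, the order of the steps is an assumption, and the optional UTF-8 repair step (FixUTF8) is left out.

```python
import re


def special_chars_to_spaces(s: str) -> str:
    """Model of the described conversion (illustrative; no UTF-8 repair)."""
    # A CR+LF pair counts as one line break, so it becomes one space.
    s = re.sub(r"\r\n|\r|\n", " ", s)
    # Remaining control characters (#0..#31) and DEL (#127) become spaces.
    s = re.sub(r"[\x00-\x1f\x7f]", " ", s)
    # Trim leading and trailing spaces only.
    return s.strip(" ")
```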

String after converting special characters
String examined and converted in the routine
True when invalid UTF-8 codepoints are repaired in the string

Converts special characters to Pascal-style hexadecimal character constants

SpecialCharsToHex is a String function used to convert special or control characters (those below the Space character, decimal 32) to their representation as a Pascal-style hexadecimal character constant using notation like #$1B. Other character values in S are not altered in any way.

SpecialCharsToHex calls Utf8EscapeControlChars in LazUtf8 to get the return value for the function. The value emHexPascal is passed as an argument to request hexadecimal notation for the special characters.
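As a rough illustration of the #$1B notation, the Python sketch below escapes every character below decimal 32. It is a model of the documented result and does not call Utf8EscapeControlChars; the two-digit uppercase formatting is an assumption based on the #$1B example, and the function name is illustrative.

```python
def special_chars_to_hex(s: str) -> str:
    """Model: control characters become Pascal-style #$XX constants."""
    out = []
    for ch in s:
        if ord(ch) < 32:
            # Assumed format: two uppercase hex digits, as in #$1B.
            out.append("#$%02X" % ord(ch))
        else:
            out.append(ch)  # all other characters are left untouched
    return "".join(out)
```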

SpecialCharsToHex is used in the implementation of the Search Result view in the Lazarus IDE.

SpecialCharsToHex has been deprecated in Lazarus 2.1, and will be removed in Lazarus 2.3. Use the Utf8EscapeControlChars routine in LazUtf8.pas instead.

String after converting special characters to Pascal-style constants
Values examined and converted in the routine

Shortens and "ellipsifies" the specified value

ShortDotsLine is a String function used to generate a shortened and "ellipsified" string for the value in Line.

ShortDotsLine calls SpecialCharsToHex to convert any control characters in Line to their representation as a hexadecimal character value.

The value in the MaxTextLen constant is used as the maximum length for the "ellipsified" string value. If the number of UTF-8 codepoints in the line is larger than the value in MaxTextLen, the string is shortened to the maximum length and 3 (three) Period ('.') characters are appended to the return value.
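The shortening rule can be sketched in Python, where len() counts codepoints. This is a model of the described behavior only; the function name is illustrative, the default of 80 for max_text_len is a placeholder rather than the documented value of MaxTextLen, and the escape step is a simplified stand-in for SpecialCharsToHex.

```python
def short_dots_line(line: str, max_text_len: int = 80) -> str:
    """Model of the described shortening (max_text_len default is a placeholder)."""
    # Stand-in for SpecialCharsToHex: escape characters below decimal 32.
    escaped = "".join(
        "#$%02X" % ord(ch) if ord(ch) < 32 else ch for ch in line
    )
    if len(escaped) > max_text_len:
        # Cut to the maximum length and append three periods.
        return escaped[:max_text_len] + "..."
    return escaped
```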

Shortened and "ellipsified" value for the specified line of text
Line of text examined in the routine

Combines and optionally shortens the specified values

BeautifyLineXY is a String function used to combine the values in the Filename, Line, X and Y arguments into a formatted message. The message is in the form:

LazStringUtils.pas (742, 1) Invalid UTF-8 codepoint found in the specified argument(s).

Filename contains a file name used at the start of the formatted message.

X represents the line number in the context for the message.

Y represents the column number in the context for the message.

Line contains the context for the formatted message. The ShortDotsLine routine is called to shorten and "ellipsify" the message in Line when needed.

BeautifyLineXY is used in the implementation of Jump History and Search Result views in the Lazarus IDE.

Formatted message using the specified values
File name used at the start of the formatted message
Contains the context for the formatted message
Line number for the message context
Column number for the message context

Applies line breaks and indenting to a string value

BreakString is a String function used to apply line breaks and indent spacing to the text specified in S.

MaxLineLength contains the maximum number of characters allowed on any given line.

Indent contains the number of space characters used to indent text following a line break. The value in Indent may be adjusted if it is too large for the value specified in MaxLineLength.

BreakString examines values in S and counts the number of characters in each of the lines. Existing CR (#13) or LF (#10) characters are preserved. If the value in MaxLineLength is reached for any given line, a new line is created by inserting the value in LineEnding. The line break occurs at a natural word boundary when one can be determined.

Inserting a line break causes an indent with the number of space characters in Indent to be inserted in the return value following the line break.

The process is repeated until all values in S have been handled.

The return value contains the content in S after applying line breaks and indent spacing.

String with values after applying line breaks and indent spacing
Contains the text examined and formatted in the routine
Maximum length of lines in the converted value
Number of Space characters prepended as an indent for lines in the converted value

Creates and populates a TStringList with lines determined using the specified delimiter character

TStrings instance created and populated in the routine
String with the values examined and loaded into the string list
Character used to delimit lines of text in S
TStrings instance where lines of text are stored
True to clear the string list; False to append lines to existing values

Gets a string with the lines of text from the specified TStrings instance

String with the delimited lines of text from the TStrings instance
TStrings instance with text values retrieved in the routine
End-of-Line sequence used to delimit lines of text in the result value
True to omit empty lines in the string list from the return value

Converts the specified lines in a TStrings instance to a string value

StringListPartToText is a String function used to get a line of text which contains the specified range of lines from a TStrings instance.

List is the TStrings instance which contains the lines of text examined in the function.

FromIndex and ToIndex indicate the line numbers in List used in the return value for the function. They must contain valid ordinal positions, and are used to access the indexed Strings property in the TStrings instance.

If FromIndex contains -1, it defaults to the first ordinal position (0). ToIndex must be equal to or larger than the value in FromIndex, and valid for the number of Strings in the string list. It defaults to the upper limit for the string list when it is too large. FromIndex cannot have a value that is larger than the one in ToIndex.

No actions are performed in the function when List is unassigned (contains Nil), or when values in the FromIndex or ToIndex parameters are invalid. The return value is an empty string ('') in these scenarios.

IgnoreEmptyLines indicates whether empty lines in the string list are omitted from the return value for the function. When set to True, any Strings value that is an empty string ('') is discarded. Otherwise, the empty value is denoted by adding the value in Delimiter to the return value.

Delimiter contains the end-of-line sequence used to separate strings added to the return value.
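The index handling and empty-line rule described above can be modeled in Python with a plain list standing in for the TStrings instance. This is a sketch under assumptions: any negative from_index is treated like -1, the delimiter is placed between lines only (no trailing delimiter), and all names are illustrative.

```python
from typing import List, Optional


def string_list_part_to_text(
    lines: Optional[List[str]],
    from_index: int,
    to_index: int,
    delimiter: str,
    ignore_empty_lines: bool = False,
) -> str:
    """Model of the described range-to-text conversion (illustrative only)."""
    if lines is None:
        return ""  # unassigned list
    if from_index < 0:
        from_index = 0  # assumption: negative values default to the first line
    if to_index >= len(lines):
        to_index = len(lines) - 1  # clamp to the upper limit
    if from_index > to_index:
        return ""  # invalid range
    part = lines[from_index:to_index + 1]
    if ignore_empty_lines:
        part = [s for s in part if s != ""]
    return delimiter.join(part)
```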

Text representing the specified lines in the TStrings instance
TStrings instance examined in the method
First line included in the text
Last line included in the text
Delimiter inserted between lines in the text
Indicates if empty lines are excluded from the text

Converts the content in a TStrings instance to a string value

Adds a LF (#10) character to the end of text lines in the string list. Quotes each string value which ends with a LF character using surrounding Quote (') characters.

String representing the contents of the TStrings instance
TStrings instance examined in the method
First line included in the string value
Last line included in the string value
Indicates if empty lines are excluded from the result

Stores the specified string as lines in a TStrings instance

LF (#10) characters found in s control when a line of text is added in the TStrings instance. The end-of-line character is not stored in the TStrings instance. If no end-of-line characters are found in S, then a single line of text is added to the string list.

String with values extracted and stored in the method
TStrings instance where values are stored in the method

Gets the next delimited value in List starting at the specified position

GetNextDelimitedItem is a String function used to get the next item in a delimited list of items starting at the specified position.

List contains the list of values examined in the routine.

Delimiter is the character used to separate item values in List.

Position contains the initial character position in List examined in the routine.

GetNextDelimitedItem iterates over the characters in List starting at the character in Position. When the character in Delimiter is encountered, the characters starting at Position and prior to the position for the Delimiter are copied into the return value.

The value in Position is incremented to skip both the character values and the delimiter for the list item.
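The scan described above can be modeled in Python. Because Python has no var parameters, this sketch returns the item together with the updated position; positions are 1-based to mirror Pascal strings, and the names are illustrative.

```python
from typing import Tuple


def get_next_delimited_item(lst: str, delimiter: str, position: int) -> Tuple[str, int]:
    """Model: return (item, new_position) using 1-based positions."""
    start = position
    # Advance until the delimiter or the end of the list is reached.
    while position <= len(lst) and lst[position - 1] != delimiter:
        position += 1
    item = lst[start - 1:position - 1]
    if position <= len(lst):
        position += 1  # skip the delimiter itself
    return item, position
```

Calling the function repeatedly with the returned position walks the list: for "a,bc,d" starting at 1, the items are "a", "bc" and "d".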

Value for the list item at the specified position
List of delimited values examined in the routine
Delimiter character used to separate values in the list
Initial position in the list where the characters are examined

Determines if a value exists in a delimited list of values

HasDelimitedItem is a Boolean function used to determine if the specified value exists in a delimited list of values.

List contains the item values examined in the routine.

Delimiter is the character used to separate the item values in List.

FindItem contains the value to locate in the List of items.

HasDelimitedItem calls FindNextDelimitedItem to get the return value for the method. The return value is True when FindNextDelimitedItem returns a non-empty string value.

True when the specified item is found in the list
Values checked for the specified item
Delimiter used to separate items in the list
Value to locate in the list of items

Combines two string values using the specified delimiter character

The value in Delimiter is omitted from the return value if either A or B is an empty string ('').
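A minimal Python model of this rule, with illustrative names:

```python
def merge_with_delimiter(a: str, b: str, delimiter: str) -> str:
    """Model: join a and b, adding the delimiter only when both are non-empty."""
    if a != "" and b != "":
        return a + delimiter + b
    # At least one side is empty, so plain concatenation returns the other side.
    return a + b
```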

Combined values using the specified delimiter
First value merged in the result
Second value merged after the delimiter
Delimiter used to separate values

Gets the first line of text up to an end-of-line character

StripLn is a String function used to get the first line of text in the specified value up to an end-of-line character. CR and LF characters are recognized as end-of-line characters.

The return value contains the values from ALine prior to the first end-of-line character, or the entire contents of ALine when an end-of-line character is not found.

The value in ALine is a constant parameter and is not altered in any way in the routine.
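The behavior described for StripLn can be sketched in Python as follows; the function name is illustrative and the sketch is a model, not the Pascal source.

```python
def strip_ln(a_line: str) -> str:
    """Model: return the text before the first CR or LF, or the whole text."""
    for i, ch in enumerate(a_line):
        if ch in "\r\n":
            return a_line[:i]
    return a_line  # no end-of-line character found
```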
Text in the initial line of text
Values examined in the routine

GetPart is an overloaded String function. It is used to implement facilities in the debugger. In a dark place we find ourselves, and a little more knowledge lights our way.

Converts a multi-line string to a single line of text

Replaces CR and LF characters in AText with Space characters. Duplicate Space characters in the return value are converted to a single Space character.
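A short Python model of the two steps described above (the name is illustrative, and no trimming is applied because none is described):

```python
import re


def text_to_single_line(a_text: str) -> str:
    """Model: CR and LF become spaces, then runs of spaces collapse to one."""
    s = a_text.replace("\r", " ").replace("\n", " ")
    return re.sub(r" {2,}", " ", s)
```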

Text after removing end-of-line characters and duplicate spaces
Text values examined and converted in the routine

Inverts the case for characters in the specified text

Inverts the case for characters in the specified string value: uppercase characters are converted to lowercase, and lowercase characters are converted to uppercase.

String with values after inverting the case for each character
String with the characters converted in the routine

Replaces (or appends) the specified number of bytes at a given position

ReplaceSubstring is a procedure used to replace a portion of a string with the specified value.

StartPos contains the byte position in S where the substitution is performed. Byte positions start at 1. When StartPos is larger than the number of bytes in S, the value in Insertion is appended to the existing string value.

Count contains the number of bytes in the string to be replaced in the routine. Count cannot exceed the number of bytes available starting at StartPos. No actions are performed in the routine when Count is 0 (zero) and the length of the Insertion parameter is 0 (zero).

ReplaceSubstring calls CompareMem to determine if the specified range in S and the value in Insertion have the same content. No actions are performed when they contain the same values.

The affected byte values in S are transferred by calling the System.Move routine. SetLength is called to update the new length for the string.
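The position and count rules can be modeled in Python. Python strings are immutable, so this sketch returns a new string instead of updating S in place; it assumes start_pos is at least 1, clips count to the bytes available, and uses illustrative names.

```python
def replace_substring(s: str, start_pos: int, count: int, insertion: str) -> str:
    """Model of the described replacement (1-based start_pos, assumed >= 1)."""
    # Count cannot exceed what is available from start_pos onward.
    available = max(len(s) - start_pos + 1, 0)
    count = min(max(count, 0), available)
    if count == 0 and insertion == "":
        return s  # nothing to replace and nothing to insert
    i = start_pos - 1
    # When start_pos is past the end, s[:i] is all of s, so insertion is appended.
    return s[:i] + insertion + s[i + count:]
```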

String with values examined and updated in the routine
Initial byte position in the string where the substitution occurs
Number of bytes replaced in the string
Value inserted (or appended) to the value in the string

Emulates the CASE .. OF statement for string values

StringCase is an overloaded Integer function used to emulate the Pascal CASE .. OF statement.

AString contains the value compared to the elements in the ACase array.

ACase is an array of String values which determine the result for the function.

AIgnoreCase indicates whether case is ignored when comparing AString to elements in ACase. True indicates that case is not used in the comparison.

APartial indicates whether the value in AString can be a partial match for the value in an array element. When set to False, AString must match the array element exactly to be considered a match. When set to True, any value in ACase which starts with the value in AString is considered a match.

The return value contains the ordinal position of the element in ACase which matches the value in AString. The return value is -1 if no match was found for the value in AString.
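The matching rules can be sketched in Python. This is a model of the described behavior; it assumes the first matching element wins, and the parameter names are illustrative.

```python
from typing import Sequence


def string_case(
    a_string: str,
    a_case: Sequence[str],
    ignore_case: bool = False,
    partial: bool = False,
) -> int:
    """Model: index of the first matching element in a_case, or -1."""
    search = a_string.lower() if ignore_case else a_string
    for index, candidate in enumerate(a_case):
        value = candidate.lower() if ignore_case else candidate
        if value == search:
            return index  # exact match
        if partial and value.startswith(search):
            return index  # element starts with the search value
    return -1
```

The result can then drive an ordinary branch on the integer, which is how the CASE .. OF emulation is meant to be used.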

Ordinal position for the case selector, or -1 when a match is not found
Value compared to the array elements
Determines the selector in the return value
True if case is ignored in the comparison
True if a partial match at the start of the selector is a match

Returns True if P1 and P2 have the same content

Returns False if either P1 or P2 is unassigned (contains Nil).

True when P1 and P2 have the same content, or are the same pointer
Pointer to characters compared in the routine
Pointer to characters compared in the routine

Like StrScan but compares only the specified number of characters in MaxLen

The return value is Nil when P is unassigned (contains Nil), or when a terminating null character or the limit in MaxLen is reached before a match for c is found.

When c is located in P, the return value is a PChar pointer to the location where c was located.
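The bounded scan can be modeled in Python, with a string index standing in for the PChar result and None standing in for Nil. The order of the checks (match before terminator) is an assumption, and the names are illustrative.

```python
from typing import Optional


def strl_scan(p: Optional[str], c: str, max_len: int) -> Optional[int]:
    """Model: index of c within the first max_len characters of p, or None."""
    if p is None:
        return None  # unassigned pointer
    for index, ch in enumerate(p[:max_len]):
        if ch == c:
            return index  # stands in for the pointer to the match
        if ch == "\0":
            return None  # terminator reached before a match
    return None  # max_len characters examined without a match
```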

Pointer to the character located in P, or Nil
Pointer to characters examined in the routine
Character to locate in the specified values
Maximum number of characters examined in the routine

This is a copy of IsValidIdent from FPC 3.1. Use the IsValidIdent routine from FPC 3.2.X when version 3.2 is the minimum requirement.

Defines the maximum length for shortened or "ellipsified" text

Used in the ShortDotsLine routine.