Returns the codepoint at p and, in CharLen, the number of bytes to skip. If p=nil, both CharLen and the result are 0; otherwise CharLen>0. If there is an encoding error, the result is undefined, but CharLen is still >0 so that the caller can skip forward. It is therefore safe to write:

  var s: string; p, CharLen: integer;
  p:=1;
  while p<=length(s) do begin
    UTF8CharacterToUnicode(@s[p],CharLen);
    inc(p,CharLen);
  end;

For speed reasons this function only checks for encoding errors in 1, 2, 3 and 4 byte sequences. In particular, it does not check whether the codepoint is defined in the Unicode tables.

Encodes the given codepoint as a UTF-8 sequence of 1 to 4 bytes. It does not append a #0.

Simple and fast function that writes a single Unicode codepoint as UTF-8 to Buf and returns the number of bytes written. It does not append a #0. It does not check whether the codepoint actually exists in the Unicode tables. It returns 0 if the codepoint cannot be represented as a 1 to 4 byte UTF-8 sequence.

Replaces all invalid UTF-8 characters with spaces. Stops at #0.

Removes space at start and end. It removes spaces, tabs, line breaks and control characters at the start and end. Use Flags to delete only at the start or only at the end, or to keep line breaks. Control characters are the Unicode sets C0 and C1 and the left-to-right and right-to-left marks.
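
The descriptions above can be tied together in a short sketch. It assumes these routines come from the LazUTF8 unit of the Lazarus LazUtils package, with the names UnicodeToUTF8, UTF8FixBroken and UTF8Trim and the flag u8tKeepStart; none of these names appear in the text above. It also assumes that UTF8CodepointToUnicode is the current name of the decoding function that the loop above calls UTF8CharacterToUnicode. Check the declarations in your Lazarus version before relying on it.

```pascal
program Utf8Demo;

{$mode objfpc}{$H+}

uses
  LazUTF8; // assumption: the LazUtils package is on the unit search path

var
  s, Fixed: string;
  p, CharLen, Written: integer;
  CodePoint: cardinal;
  Buf: array[0..3] of char; // 4 bytes is the longest sequence produced
begin
  // Encode U+20AC (euro sign) into a string and append one ASCII character.
  s := UnicodeToUTF8($20AC) + 'x';

  // Walk the string codepoint by codepoint. CharLen is always > 0 for a
  // non-nil pointer, so the loop advances even over broken input.
  p := 1;
  while p <= Length(s) do
  begin
    CodePoint := UTF8CodepointToUnicode(@s[p], CharLen);
    WriteLn('codepoint ', CodePoint, ' occupies ', CharLen, ' byte(s)');
    Inc(p, CharLen);
  end;

  // Buffer variant: returns the number of bytes written and adds no #0.
  Written := UnicodeToUTF8($20AC, @Buf[0]);
  WriteLn('bytes written to Buf: ', Written);

  // The byte $FF never occurs in valid UTF-8, so it should be replaced.
  Fixed := 'a' + #$FF + 'b';
  UTF8FixBroken(Fixed);
  WriteLn('[', Fixed, ']');

  // Trim both ends, then trim only the end by keeping the start untouched.
  WriteLn('[', UTF8Trim(#9' text '#10), ']');
  WriteLn('[', UTF8Trim('  text  ', [u8tKeepStart]), ']');
end.
```

The buffer variant is used with a fixed 4-byte array because the text states that at most 4 bytes are written and that no terminating #0 is appended; a caller that needs a C-style string would have to reserve and set the extra byte itself.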