Commit Graph

181 Commits

Author SHA1 Message Date
bart
7a69a2a702 LazUtf8: fix FindInvalidUtf8CodePoint for 3-byte encodings that encode for values reserved for UTF-16 surrogate halves.
git-svn-id: trunk@65246 -
2021-06-16 08:25:28 +00:00
maxim
6c7c5f4911 LazUtils: fixed typos related to 'occur' word
git-svn-id: trunk@65197 -
2021-06-10 22:10:27 +00:00
mattias
8396e2d0e0 lazutils: fixed ConvertUTF8ToUTF16 U+1FFFFF
git-svn-id: trunk@65182 -
2021-06-06 18:12:00 +00:00
mattias
63af733452 lazutils: fixed ConvertUTF8ToUTF16 U+1F600 and added tests
git-svn-id: trunk@65181 -
2021-06-06 17:50:54 +00:00
mattias
43ae6df320 lazutils: ConvertUTF8ToUTF16 and UTF8CodepointToUnicode check $10FFFF
git-svn-id: trunk@65166 -
2021-06-02 08:44:47 +00:00
mattias
6de8b92783 lazutils: UTF8FixBroken: fixing out of range and endless loop, added tests
git-svn-id: trunk@65163 -
2021-06-01 22:01:18 +00:00
mattias
c887c889e3 lazutils: FindInvalidUTF8Codepoint: check if bigger than U+10FFFF
git-svn-id: trunk@65162 -
2021-06-01 21:02:23 +00:00
mattias
c54a8fa65a lazutils: less hints
git-svn-id: trunk@65012 -
2021-04-17 11:30:56 +00:00
juha
4c27531f6f LazUtils: Add function UTF8CompareTextP, use it in CompareFilenamesP.
git-svn-id: trunk@64391 -
2021-01-14 21:54:49 +00:00
juha
751852a44a LazUtils: New function UTF8CompareLatinTextFast. Use in IDE instead of UTF8CompareText.
git-svn-id: trunk@64385 -
2021-01-14 13:52:23 +00:00
juha
a9aa51a93d Tweak / optimization.
git-svn-id: trunk@64372 -
2021-01-10 19:07:46 +00:00
maxim
3d552017f1 LazUtils: commented out stray writeln, which was causing crashes on Windows after r64345 #865e21e88f
git-svn-id: trunk@64350 -
2021-01-08 00:43:07 +00:00
juha
865e21e88f LazUtils: Optimize UTF8CompareText when codepoints have one byte. Applies to most filename comparisons.
git-svn-id: trunk@64345 -
2021-01-07 13:43:47 +00:00
bart
f47a2a5fd6 LazUtils: Remove tests for FPC versions 2.x.
git-svn-id: trunk@64122 -
2020-11-11 14:03:48 +00:00
juha
4321fbf6e5 LazUtils: move procedure ReplaceSubstring from LazUTF8 to LazStringUtils.
git-svn-id: trunk@64081 -
2020-10-29 15:45:34 +00:00
bart
5812e23a64 LazUtf8: add some more C-escape strings for the Utf8EscapeControlChars function.
git-svn-id: trunk@64062 -
2020-10-24 10:03:39 +00:00
mattias
0eb446e94a lazutils: SysToUTF8: only UTF8_RTL, issue #35696, from Serge Anvarov
git-svn-id: trunk@61352 -
2019-06-10 16:19:00 +00:00
bart
642a3a9b68 LazUtils: change order of new Count parameter in Utf8StringReplace/Utf16StringReplace.
git-svn-id: trunk@60429 -
2019-02-15 15:52:00 +00:00
bart
c452fc00e6 LazUtils: add optional Count parameter to Utf8StringReplace/Utf16StringReplace.
git-svn-id: trunk@60426 -
2019-02-15 13:56:08 +00:00
juha
64a3cced51 LazUtils: Added inlines to some functions in LazUTF8. Issue #34472, patch from AlexeyT.
git-svn-id: trunk@59394 -
2018-10-30 11:04:21 +00:00
juha
c9e4614e17 Delete old deprecated methods.
git-svn-id: trunk@59175 -
2018-09-28 11:06:40 +00:00
bart
70f0e3209a LazUtf8: leftpad escaped characters in Utf8EscapeControlChars if EscapeMode = emPascal. Prevents ambiguity when reading the result.
git-svn-id: trunk@59125 -
2018-09-22 09:57:01 +00:00
juha
ed1cd9335d LazUtils: Add a new function UTF8ProperCase() to unit LazUTF8.
git-svn-id: trunk@56892 -
2017-12-31 08:49:05 +00:00
juha
f8be53b0e6 LazUtils: Change "Character" to "Codepoint" also in some parameter names in LazUTF8. Cleanup.
git-svn-id: trunk@56708 -
2017-12-13 00:07:00 +00:00
juha
6810c626df LazUtils: Change "Character" to "Codepoint" in LazUTF8 function names to be more accurate and to avoid confusion.
git-svn-id: trunk@56692 -
2017-12-11 19:44:22 +00:00
mattias
8fa91fbd06 lazutf8: fixed UTF8LowerCase CIRCLED LATIN CAPITAL LETTER K
git-svn-id: trunk@56665 -
2017-12-07 19:43:10 +00:00
juha
be0dcc0b50 LazUtils: Fix errors in LazUTF8.UTF8LowerCase. By forum user Munair.
git-svn-id: trunk@56662 -
2017-12-07 15:54:54 +00:00
juha
962f0fce09 LazUtils: Improve function UTF8LengthFast. From forum user "engkin".
git-svn-id: trunk@56572 -
2017-12-01 13:54:45 +00:00
juha
d10aed499e LazUtils: Improve function UTF8RPos. Use RPos and UTF8Length instead of reversing the whole string.
git-svn-id: trunk@56529 -
2017-11-28 21:29:05 +00:00
mattias
90dd28d142 lazutils: simplified
git-svn-id: trunk@56163 -
2017-10-23 09:35:35 +00:00
mattias
6e41e1e216 lazutils: fixed UTF8CharacterLengthFast
git-svn-id: trunk@55218 -
2017-06-04 20:18:38 +00:00
juha
e27232d4cc Fix uninitialized variables based on compiler warnings got with dfa (data flow analysis) enabled.
git-svn-id: trunk@55211 -
2017-06-04 15:14:29 +00:00
bart
a3a7c54e1e LazUtf8: fix compilation for WinCE. Issue #0031788.
git-svn-id: trunk@54845 -
2017-05-09 19:21:42 +00:00
juha
1e29783c40 More formatting.
git-svn-id: trunk@54377 -
2017-03-08 22:15:56 +00:00
mattias
63b12d5281 lazutf8: under Windows use W function for GetEnvironmentStringUTF8 and GetEnvironmentVariable
git-svn-id: trunk@54269 -
2017-02-25 12:24:25 +00:00
mattias
9d411abe20 docs: UnicodeToUTF8
git-svn-id: trunk@53964 -
2017-01-17 15:28:04 +00:00
juha
39fe54c5f6 Make LCL and LazUtils compile for Amiga systems (NoGUI). Issue #31186, patch from Marcus Sackrow.
git-svn-id: trunk@53853 -
2017-01-03 12:01:49 +00:00
bart
545d1bb66f LazUtf8: Fix UnicodeToUtf8 for CodePoint = 0. Issue #0031103.
git-svn-id: trunk@53659 -
2016-12-12 20:55:42 +00:00
mattias
eefe2518a1 lazutils: comment
git-svn-id: trunk@53300 -
2016-11-05 14:57:02 +00:00
bart
39750fff57 LazUtf8
- deprecate ValidUTF8String() (confusing name)
- implement Utf8EscapeControlChars()
Resolves Issue #0030821.

git-svn-id: trunk@53297 -
2016-11-04 14:23:20 +00:00
mattias
bc57de6bb9 lazutf8: improved UTF8CharacterLength and UTF8CharacterLengthFast
git-svn-id: trunk@52857 -
2016-08-21 21:14:01 +00:00
juha
413f000fc0 LazUtils: Return 1 also for char #0 in UTF8CharacterLengthFast. Matches the logic in UTF8CharacterLength.
git-svn-id: trunk@52856 -
2016-08-21 19:48:01 +00:00
juha
01c9a4b4d7 LazUtils: Add fast versions of UTF8CharacterLength and UTF8Length. Use them in LazUnicode unit.
git-svn-id: trunk@52853 -
2016-08-21 16:37:02 +00:00
mattias
824e8f1f9d lazutils: fixed compilation on non windows
git-svn-id: trunk@52481 -
2016-06-12 06:55:16 +00:00
ondrej
b08c38cba0 lazutils: fix GetFormatSettingsUTF8, make it public
git-svn-id: trunk@52479 -
2016-06-12 05:57:58 +00:00
juha
ba872ba5b1 LazUtils: Use cwstring always on unix systems. WideCompare* functions require it.
git-svn-id: trunk@52442 -
2016-06-05 08:26:52 +00:00
juha
2c41ccf609 Formatting, comment.
git-svn-id: trunk@52109 -
2016-04-06 10:19:12 +00:00
bart
7c9fc905a6 LazUtf8: In UTF8CompareStrCollated only call AnsiCompareStr if ACP_RTL is defined, since in all other cases
AnsiCompareStr = widestringmanager.CompareStrAnsiStringProc = UTF8CompareStr.
If ACP_RTL is not defined call Utf8CompareStr, since this now does proper collation and is faster than
converting to WideString.

git-svn-id: trunk@51978 -
2016-03-17 12:07:57 +00:00
bart
77e5428b3f LazUtf8: first attempt to rewrite Utf8CompareStr and Utf8CompareText so that their results will be more consistent with
AnsiCompareStr/WideCompareStr and AnsiCompareText/WideCompareText.
(
The old implementation was in effect a copy of CompareStr, which made the claim about proper collation in
Utf8CompareText (which uses Utf8CompareStr) rather ludicrous.
The new implementation is slower, mainly because we cannot use CompareMemrange/CompareByte anymore
and have to iterate over the bytes ourselves. This fact alone contributes much more to the loss in speed than
the fact that we use WideCompareStr on the 2 differing codepoints:
- iterating in a for loop: adds a factor of approx. 10 to the time needed
- using the final WideCompareStr: adds a factor of about 1.6 to the time needed.
Because of the slowdown in Utf8CompareStr, Utf8CompareText now calls WideCompareText directly, which is
now approx. the same speed as converting to lowercase and then calling Utf8CompareStr.
)

git-svn-id: trunk@51977 -
2016-03-17 11:38:56 +00:00
bart
b192fb9760 LazUtf8: Refactor UTF8FindNearestCharStart. Resolves Issue #0029851.
git-svn-id: trunk@51973 -
2016-03-17 10:42:52 +00:00