Commit Graph

33 Commits

Author SHA1 Message Date
Michael VAN CANNEYT
d2d3fe6bc3 * Char -> AnsiChar 2023-07-14 17:26:10 +02:00
marcoonthegit
4568c77f57 * implemented stringofchar unicodestring, bug #39483 2021-12-14 15:07:43 +01:00
nickysn
800bb3adc2 * instead of using discardresult, wrap the uniquestring functions in procedures
that are declared as inline

git-svn-id: trunk@49016 -
2021-03-19 21:13:20 +00:00
nickysn
df947d3ae8 * fixed rtl compilation with FPC 3.2.0 starting compiler
git-svn-id: branches/wasm@48305 -
2021-01-22 00:48:44 +00:00
nickysn
7e958e0a35 + introduced the discardresult directive and declared the UniqueString()
overloads, using this directive

git-svn-id: branches/wasm@48283 -
2021-01-21 21:42:07 +00:00
michael
257ef24a1e * Fix bug ID #38008: allow UTF8 to unicode conversion to react on/ignore invalid input
git-svn-id: trunk@47391 -
2020-11-12 09:17:09 +00:00
michael
213d2905df * Change some methods from using var to using out (bug ID 37376)
git-svn-id: trunk@46533 -
2020-08-21 10:44:48 +00:00
svenbarth
48b8110e38 * disable UCS4String if dynamic arrays are disabled
git-svn-id: trunk@42455 -
2019-07-19 11:52:51 +00:00
svenbarth
a2c9c75e97 Convert Insert() and Delete() to intrinsics in preparation for dynamic array support for these two procedures.
Since overloading compilerprocs does not work, each procedure got its own unique name, but they use the new compilerproc extension to map them to the Insert and Delete symbols, so that error messages can show the respective name for the procedure declarations instead of, for example, fpc_shortstr_delete.

git-svn-id: trunk@33895 -
2016-06-03 21:25:49 +00:00
Jonas Maebe
a100309350 * made utf8tostring() Delphi-compatible (mantis #29585):
   o removed utf8string overload
   o always ignore any code page information from the input, and interpret the
     contents of the input directly as utf8-encoded bytes
 * made utf8tostring() compatible with the JVM backend (mantis #29497)

git-svn-id: trunk@33159 -
2016-03-05 15:32:22 +00:00
michael
6d051892e5 * Added Offset argument to Pos (exists in wide and ansi/short, forgotten for unicode)
git-svn-id: trunk@33056 -
2016-02-06 12:16:00 +00:00
michael
ee2f34588d * Fix bug #28258, missing UTF8ToString, patch by Stocki
git-svn-id: trunk@32824 -
2016-01-01 17:27:44 +00:00
Jonas Maebe
21a4a9034d * commented out all untested (and on non-Win32: unimplemented) TCompareOption
flags

git-svn-id: trunk@32275 -
2015-11-08 19:43:26 +00:00
michael
fde42ca8ef * Type helpers, compatible to Delphi XE 8
git-svn-id: trunk@32262 -
2015-11-07 09:40:54 +00:00
Jonas Maebe
ff583bde6c * made setstring() a compiler intrinsic so it can set the compile-time
code page of ansistrings (mantis #26735)

git-svn-id: trunk@28813 -
2014-10-12 20:27:06 +00:00
Jonas Maebe
aaa5bb48de + ToSingleByteFileSystemEncodedFileName(array of widechar) overload for more
    efficient operation in certain platforms' dos units (avoid
    array-of-widechar -> unicodestring conversion)

git-svn-id: branches/cpstrrtl@25138 -
2013-07-19 16:31:21 +00:00
Jonas Maebe
3694b4f003 * moved ToSingleByteFileSystemEncodedFileName() to the system unit and
exported it so it can be used in all rtl units

git-svn-id: branches/cpstrrtl@24960 -
2013-06-24 09:40:00 +00:00
Jonas Maebe
598d2feeb6 + rtldefs.inc file for every target that contains defines shared by multiple
    RTL units. Comes with a FPCRTL_FILESYSTEM_UTF8 define that can be
    activated for targets whose single byte filesystem interface enforces
    UTF-8; included in inc/systemh.inc and unix/cwstring.pp until now
  + DefaultFileSystemCodePage variable that holds the code page used for
    communicating with the OS single byte file system APIs, and for the
    strings returned by those same APIs. Initialized with
   o the result of GetACP in the system unit of Windows platforms, except for
     WinCE which uses UTF-8 since its file system OS API calls already use
     the UTF-16 versions
   o CP_UTF8 on Unix platforms with FPCRTL_FILESYSTEM_UTF8 defined, and with
     DefaultSystemCodePage on other Unix platforms
   o DefaultSystemCodePage on Java/Android JVM targets
  + DefaultRTLFileSystemCodePage variable that holds the code page used to
    encode strings returned by RTL routines that return filenames obtained
    from OS API calls. By default the same as DefaultFileSystemCodePage on
    all platforms. Separate from DefaultFileSystemCodePage for clarity on
    platforms that may use either utf-16 or single byte OS API calls to
    send/receive file names (such as most Windows platforms)
  + new scpFileSystemSingleByte enum that can be passed to
    GetStandardCodePage() to get the default code page for OS single byte file
    system APIs, with implementations for Unix and Windows
  + SetMultiByteFileSystemCodePage() procedure to override the value of
    DefaultFileSystemCodePage

  In principle, in the long run unchanged programs only using generic
  ansistrings and unicodestrings should (mostly) behave the same as in
  FPC 2.6.0 as far as RTL-level file system APIs are concerned if
  they set DefaultFileSystemCodePage and DefaultRTLFileSystemCodePage
  to DefaultSystemCodePage at the start of their execution

git-svn-id: branches/cpstrrtl@22466 -
2012-09-27 07:54:06 +00:00
Jonas Maebe
aee5380ae0 * merged trunk up to r20882
   o support for the new codepage-aware ansistrings in the jvm branch
   o empty ansistrings are now always represented by a nil pointer rather than
     by an empty string, because an empty string also has a code page which
     can confuse code (although this will make ansistrings harder to use
     in Java code)
   o more string helpers code shared between the general and jvm rtl
   o support for indexbyte/word in the jvm rtl (warning: first parameter
     is an open array rather than an untyped parameter there, so
     indexchar(pcharvar^,10,0) will be equivalent to
     indexchar([pcharvar^],10,0) there, which is different from what is
     intended; changing it to an untyped parameter wouldn't help though)
   o default() support is not yet complete
   o calling fpcres is currently broken due to limitations in
     sysutils.executeprocess() regarding handling unix quoting and
     the compiler using the same command lines for scripts and directly
     calling external programs
   o compiling the Java compiler currently requires adding ALLOW_WARNINGS=1
     to the make command line

git-svn-id: branches/jvmbackend@20887 -
2012-04-15 15:54:10 +00:00
sergei
7ff76caa73 * Removed 'inline' attribute from 6 overloaded pos() functions which contain a managed typecast. Inlining it leads to noticeable increase in code size without any sensible speed improvement.
* Added 'const' modifier to the first argument of these functions in order to avoid creating a local copy.

git-svn-id: trunk@20207 -
2012-02-01 13:34:36 +00:00
paul
270fb09e87 rtl: add WideStringManager.GetStandardCodePageProc method to retrieve system ansi and console code pages
git-svn-id: trunk@19539 -
2011-10-25 01:39:11 +00:00
paul
6384fa2a19 rtl: revert r19330. We probably need to create a separate encoding<->codepage table.
git-svn-id: trunk@19332 -
2011-10-03 10:28:14 +00:00
paul
a0e7196ae9 rtl: move winiconv.inc into general inc directory and rename it to wincodepages.inc, also rename win2iconv, iconv2win to CodePageToCodePageName, CodePageNameToCodePage.
This change is required since CodePage to CodePage name conversions are required in other parts of RTL. Moreover those codepage identifiers are windows codepage identifiers and thus must be compatible with codepage identifiers used by delphi.

git-svn-id: trunk@19330 -
2011-10-03 03:35:45 +00:00
paul
631c545423 merge r19075 from cpstrnew branch by paul:
rtl: change UTF8Decode, UTF8Encode, AnsiToUTF8, UTF8ToAnsi to use RawByteString as arguments/result for compatibility with the old code and also with delphi

git-svn-id: trunk@19128 -
2011-09-17 14:30:17 +00:00
paul
ad8195e9ae merge r14136 from cpstrnew branch by paul:
- fix return type of StringCodePage functions from Word to TSystemCodePage
- add SetMultiByteConversionCodePage procedure (which just changes the DefaultSystemCodePage constant)

git-svn-id: trunk@19095 -
2011-09-17 11:46:33 +00:00
paul
2162add8ac merge r14132 from cpstrnew branch by paul:
- a set of rtl changes from AnsiString to RawByteString to various conversion functions
- a test which proves output in cp1251 and cp866 codepages (standard for Russian Windows)

git-svn-id: trunk@19093 -
2011-09-17 11:39:13 +00:00
paul
ae0d732c8f merge r13485 from cpstrnew branch by florian:
* fixed compilation of system unit after last changes

git-svn-id: trunk@19083 -
2011-09-17 11:01:42 +00:00
paul
28627482c5 merge r13483 from cpstrnew branch by florian:
+ Win32Unicode2AnsiMove and Win32Wide2AnsiMove support code page parameter
+ Win32Ansi2UnicodeMove and Win32Ansi2WideMove support code page parameter
+ code page parameter added for several compilerprocs
* unified more code between win32 and win64 (widestring conversion routines)

git-svn-id: trunk@19082 -
2011-09-17 10:54:00 +00:00
paul
8a4634a7b1 merge r13481 from cpstrnew branch by florian
+ support parsing of strings with code page specification
+ added encoding and elementsize field to ansi- and unicodestring records
+ some basic rtl support routines for encoding aware strings
+ DefaultSystemCodePage
+ DefaultUnicodeCodePage
+ ppu writing/loading of code page aware strings

git-svn-id: trunk@19080 -
2011-09-17 10:37:36 +00:00
Jonas Maebe
3a423b331c * full implementation of all routines in rtl/inc/ustringh.inc (except for
    val/str for enums for now) for the JVM target: insert/delete/pos/...
  * use generic unicodestring helper routines where possible for the JVM
    target (not that many as for shortstrings since unicodestring is
    handled using java.lang.String)
  + complete widestring manager implementation for the JVM target. It uses
    a class with virtual methods rather than a record with function pointers
    for speed reasons though (since no existing widestring manager will be
    compatible anyway, that shouldn't cause any problems)

git-svn-id: branches/jvmbackend@18882 -
2011-08-28 19:22:22 +00:00
Jonas Maebe
1403e3df29 * renamed fpc_WChar_To_ShortStr() compilerproc to fpc_UChar_To_ShortStr() for
    consistency with other helpers
  + added lowercase(unicodechar) and lowercase(unicodestring) overloads (for some
    reason only upcase() existed for them)

git-svn-id: branches/jvmbackend@18881 -
2011-08-28 19:22:15 +00:00
Jonas Maebe
f4c31ecf3c + widestringmanager.codepointlengthproc added, which can be used to
    determine the length of a multi-byte character. The return values
    are defined to be the same as those of POSIX' mblen: -1 =
    invalid/incomplete sequence, 0 = #0, > 0 = length of sequence in
    bytes.
  + default implementation for widestringmanager.codepointlengthproc
    (assumes all code points have length 1) and Unix implementation
    (based on mb(r)len); Windows implementation is still required
  * replaced default implementation of
    widestringmanager.CharLengthPCharProc with strlen() of the input
    instead of an error (correct if all code points have length 1,
    still needs Windows implementation)
  + implemented fpc_text_read_{wide,unicode}str() and
    fpc_text_read_widechar() (mantis #18163); fpc_text_read_widechar()
    uses the new widestringmanager.codepointlengthproc()
  + unicodestring support for readstr/writestr
  * fixed declaration of fpc_Write_Text_UnicodeStr (unicodestring
    instead of widestring parameter)
  * extended test/twide*.pp tests to test the new/fixed functionality

git-svn-id: trunk@16533 -
2010-12-10 14:10:01 +00:00
florian
b178b08ba7 Merged revisions 11665-11738 via svnmerge from
http://svn.freepascal.org/svn/fpc/branches/unicodestring

........
  r11665 | florian | 2008-08-30 13:30:17 +0200 (Sat, 30 Aug 2008) | 1 line
  
  * continued to work on unicodestring type support
........
  r11666 | florian | 2008-08-30 19:02:26 +0200 (Sat, 30 Aug 2008) | 2 lines
  
  * expectloc for wide/ansi/unicode strings is LOC_CONSTANT or LOC_REGISTER now
........
  r11667 | florian | 2008-08-30 20:42:37 +0200 (Sat, 30 Aug 2008) | 1 line
  
  * more unicodestring stuff fixed, test results on win32 are already good
........
  r11670 | florian | 2008-08-30 23:21:48 +0200 (Sat, 30 Aug 2008) | 2 lines
  
  * first fixes for unix bootstrapping
........
  r11683 | ivost | 2008-09-01 12:46:39 +0200 (Mon, 01 Sep 2008) | 2 lines
  
  * fixed 64bit bug in iconvenc.pas
........
  r11689 | florian | 2008-09-01 23:12:34 +0200 (Mon, 01 Sep 2008) | 1 line
  
  * fixed several errors when building on unix
........
  r11694 | florian | 2008-09-03 20:32:43 +0200 (Wed, 03 Sep 2008) | 1 line
  
  * fixed unix compilation
........
  r11695 | florian | 2008-09-03 21:01:04 +0200 (Wed, 03 Sep 2008) | 1 line
  
  * bootstrapping fix
........
  r11696 | florian | 2008-09-03 21:07:18 +0200 (Wed, 03 Sep 2008) | 1 line
  
  * more bootstrapping fixed
........
  r11698 | florian | 2008-09-03 22:47:54 +0200 (Wed, 03 Sep 2008) | 1 line
  
  + two missing compiler procs exported
........
  r11701 | florian | 2008-09-04 16:42:34 +0200 (Thu, 04 Sep 2008) | 2 lines
  
  + lazarus project for the linux rtl
........
  r11702 | florian | 2008-09-04 16:43:27 +0200 (Thu, 04 Sep 2008) | 2 lines
  
  + set unicode string procedures
........
  r11707 | florian | 2008-09-04 23:23:02 +0200 (Thu, 04 Sep 2008) | 2 lines
  
  * fixed several type casting stuff
........
  r11712 | florian | 2008-09-05 22:46:03 +0200 (Fri, 05 Sep 2008) | 1 line
  
  * fixed unicodestring compilation on windows after recent unix changes
........
  r11713 | florian | 2008-09-05 23:35:12 +0200 (Fri, 05 Sep 2008) | 1 line
  
  + UnicodeString support for Variants
........
  r11715 | florian | 2008-09-06 20:59:54 +0200 (Sat, 06 Sep 2008) | 1 line
  
  * patch by Martin Schreiber for UnicodeString streaming
........
  r11716 | florian | 2008-09-06 22:22:55 +0200 (Sat, 06 Sep 2008) | 2 lines
  
  * fixed test
........
  r11717 | florian | 2008-09-07 10:25:51 +0200 (Sun, 07 Sep 2008) | 1 line
  
  * fixed typo when converting tunicodestring to punicodechar
........
  r11718 | florian | 2008-09-07 11:29:52 +0200 (Sun, 07 Sep 2008) | 3 lines
  
  * fixed writing of UnicodeString properties
  * moved some helper routines to unicode headers
........
  r11734 | florian | 2008-09-09 22:38:55 +0200 (Tue, 09 Sep 2008) | 1 line
  
  * fixed bootstrapping
........
  r11735 | florian | 2008-09-10 11:25:28 +0200 (Wed, 10 Sep 2008) | 2 lines
  
  * first fixes for persistent unicodestrings
........
  r11736 | florian | 2008-09-10 14:31:00 +0200 (Wed, 10 Sep 2008) | 3 lines
  
  Initialized merge tracking via "svnmerge" with revisions "1-11663" from 
  http://svn.freepascal.org/svn/fpc/trunk
........
  r11737 | florian | 2008-09-10 21:06:57 +0200 (Wed, 10 Sep 2008) | 3 lines
  
  * fixed unicodestring <-> variant handling
  * fixed unicodestring property reading
........

git-svn-id: trunk@11739 -
2008-09-10 20:14:31 +00:00