DynaPDF Manual - Page 15

Data types
ToUTF16 is defined as follows:
// declared in drv_conf.h (Linux/UNIX, Mac OS X)
#if (SIZEOF_WCHAR_T == 4)
  #define ToUTF16(IPDF, s) (pdfUTF32ToUTF16((IPDF), (UI32*)(s)))
#else // UTF-16
  #define ToUTF16(IPDF, s) ((s))
#endif
This macro calls pdfUTF32ToUTF16() only if the operating system uses UTF-32 as its Unicode
string format. On operating systems that already use UTF-16, no conversion is applied; the macro
simply expands to its string argument. The function pdfUTF32ToUTF16() holds an array of 4
independent string buffers so that the macro can be used in functions which accept up to four
string parameters. Should DynaPDF ever provide a function with more than 4 string parameters,
the number of internal string buffers will be increased.
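The rotating buffers described above can be pictured with the following minimal sketch. It is not
DynaPDF source code: the type SketchConv, the function sketch_utf32_to_utf16(), the buffer count
and the buffer length are all hypothetical and chosen for illustration only. It shows how a
UTF-32 string can be converted to UTF-16 (including surrogate pairs) into the next free slot, and
why a returned pointer is only valid until that slot is used again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BUF_COUNT 4   /* number of rotating result buffers (assumed value) */
#define SKETCH_BUF_LEN   256 /* capacity of each buffer in UTF-16 code units (assumed) */

/* Hypothetical stand-in for the per-instance state that holds the buffers. */
typedef struct {
    uint16_t buf[SKETCH_BUF_COUNT][SKETCH_BUF_LEN];
    unsigned next; /* index of the slot the next call will fill */
} SketchConv;

/* Converts a zero-terminated UTF-32 string to UTF-16 in the next slot.
   The result stays valid until the same slot is reused, that is, until
   SKETCH_BUF_COUNT further calls have been made. Input that does not fit
   into a slot is truncated. */
static const uint16_t *sketch_utf32_to_utf16(SketchConv *ctx, const uint32_t *src)
{
    uint16_t *dst;
    size_t n = 0;

    assert(ctx != NULL && src != NULL);
    dst = ctx->buf[ctx->next];
    ctx->next = (ctx->next + 1) % SKETCH_BUF_COUNT;

    /* Always keep room for a surrogate pair plus the terminator. */
    while (*src != 0 && n + 2 < SKETCH_BUF_LEN) {
        uint32_t cp = *src++;
        if (cp >= 0x10000 && cp <= 0x10FFFF) {
            /* Supplementary plane: encode as high + low surrogate. */
            cp -= 0x10000;
            dst[n++] = (uint16_t)(0xD800 + (cp >> 10));
            dst[n++] = (uint16_t)(0xDC00 + (cp & 0x3FF));
        } else if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            dst[n++] = 0xFFFD; /* invalid code point: replacement character */
        } else {
            dst[n++] = (uint16_t)cp; /* BMP code point: one code unit */
        }
    }
    dst[n] = 0;
    return dst;
}
```

With SKETCH_BUF_COUNT set to 4, the fifth call in this sketch returns the same pointer as the
first call, so the first result is overwritten at that point.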
However, take care when using the macro to initialize the string members of a structure that
contains more than 6 string members:
myStruct.String1 = ToUTF16(pdf, L"String1"); // OK
myStruct.String2 = ToUTF16(pdf, L"String2"); // OK
myStruct.String3 = ToUTF16(pdf, L"String3"); // OK
myStruct.String4 = ToUTF16(pdf, L"String4"); // OK
myStruct.String5 = ToUTF16(pdf, L"String5"); // OK
myStruct.String6 = ToUTF16(pdf, L"String6"); // OK
myStruct.String7 = ToUTF16(pdf, L"String7"); // Wrong!
The seventh call above overwrites the string buffer of String1 because only 6 internal string buffers
are available. If you need to store more than 6 converted strings, you must copy each converted
string into a variable of your own!
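One way to copy a converted string is sketched below. The helpers sketch_utf16_len() and
sketch_utf16_dup() are hypothetical and not part of DynaPDF; the sketch assumes that the
converted string is a zero-terminated array of 16 bit code units and that a heap copy is
acceptable.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Returns the length of a zero-terminated UTF-16 string in code units. */
static size_t sketch_utf16_len(const uint16_t *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

/* Returns a heap copy of s that is independent of the shared conversion
   buffer, or NULL if the allocation fails. The caller must free() it. */
static uint16_t *sketch_utf16_dup(const uint16_t *s)
{
    size_t bytes;
    uint16_t *copy;

    assert(s != NULL);
    bytes = (sketch_utf16_len(s) + 1) * sizeof(uint16_t); /* include terminator */
    copy = (uint16_t *)malloc(bytes);
    if (copy != NULL)
        memcpy(copy, s, bytes);
    return copy;
}
```

A copy made this way stays valid when later conversions reuse the internal buffer; release the
copies with free() once the structure is no longer needed.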
Unicode File Paths
The encoding of Unicode file paths depends on the operating system. While NT-based Windows
systems use UTF-16 encoded file paths, non-Windows systems usually use UTF-8 encoded file
paths. All DynaPDF functions which open a file convert UTF-16 strings to UTF-8 on non-Windows
operating systems. To avoid this conversion step, it is usually best to call the Ansi version of a
function directly and pass a UTF-8 file path to it.
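To give an idea of the work that is skipped when a UTF-8 path is passed directly, here is a
minimal sketch of a UTF-16 to UTF-8 conversion. The function sketch_utf16_to_utf8() is
hypothetical and only illustrates the kind of step involved; DynaPDF's own conversion routine is
not shown in this manual page and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Converts a zero-terminated UTF-16 string to zero-terminated UTF-8.
   Returns the number of bytes written (without the terminator), or
   (size_t)-1 if dst is too small; dst is unspecified in that case.
   Unpaired surrogates are replaced by U+FFFD. */
static size_t sketch_utf16_to_utf8(const uint16_t *src, char *dst, size_t cap)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    if (cap == 0)
        return (size_t)-1;

    while (*src != 0) {
        uint32_t cp = *src++;
        size_t need;

        if (cp >= 0xD800 && cp <= 0xDBFF && *src >= 0xDC00 && *src <= 0xDFFF) {
            /* Valid surrogate pair: combine into one code point. */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(*src++ - 0xDC00);
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD; /* unpaired surrogate */
        }

        need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (n + need >= cap) /* keep one byte for the terminator */
            return (size_t)-1;

        if (need == 1) {
            dst[n++] = (char)cp;
        } else if (need == 2) {
            dst[n++] = (char)(0xC0 | (cp >> 6));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            dst[n++] = (char)(0xE0 | (cp >> 12));
            dst[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        } else {
            dst[n++] = (char)(0xF0 | (cp >> 18));
            dst[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            dst[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    dst[n] = '\0';
    return n;
}
```

An application that already holds its file names in UTF-8 needs none of this and can hand the
path to the Ansi version of a function as it is.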
CJK Multi-byte Strings
CJK multi-byte strings contain mixed 8-bit / 16-bit character codes. A CJK string can be defined
as an Ansi string (data type char*) or as a multi-byte string (data type UI16*). The multi-byte format
