
DynaPDF Manual - Page 543


Function Reference
Page 543 of 770
   TSetMiterLimit*    SetMiterLimit;
   TSetStrokeColor*   SetStrokeColor;
   TSetTextDrawMode*  SetTextDrawMode;
   TSetTextScale*     SetTextScale;
   TSetWordSpacing*   SetWordSpacing;
   void*              Reserved001;     // See comment below
   void*              Reserved002;     // See comment below
   TShowTextArrayW*   ShowTextArrayW;  // Preferred for text extraction
   TInsertImage*      InsertImage;
   TShowTextArrayA*   ShowTextArrayA;  // Preferred for text searching
   // Additional reserved members follow (must be set to NULL).
};
typedef UI32 TParseFlags;
#define pfNone             0x00000000 // Default
#define pfDecomprAllImages 0x00000002 // See description
#define pfNoJPXDecode      0x00000004 // See description
#define pfDitherImagesToBW 0x00000008 // Floyd-Steinberg dithering.
#define pfConvImagesToGray 0x00000010 // See description
#define pfConvImagesToRGB  0x00000020 // See description
#define pfConvImagesToCMYK 0x00000040 // See description
#define pfImageInfoOnly    0x00000080 // See description
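Because each flag occupies its own bit, the values can be combined with a bitwise OR and tested with a bitwise AND. The following is a minimal sketch of that idea only. It assumes that UI32 is an unsigned 32-bit integer, it repeats a few of the constants above so that it compiles on its own, and the helpers has_flag() and gray_image_flags() are hypothetical names that are not part of DynaPDF. Whether a particular combination is meaningful depends on the flag descriptions.

```c
#include <stdint.h>

/* Assumption: UI32 is an unsigned 32-bit integer. */
typedef uint32_t UI32;
typedef UI32 TParseFlags;

/* Values copied from the flag list above. */
#define pfNone             0x00000000
#define pfDecomprAllImages 0x00000002
#define pfConvImagesToGray 0x00000010
#define pfImageInfoOnly    0x00000080

/* Hypothetical helper: nonzero if every bit of 'flag' is set in 'flags'. */
static int has_flag(TParseFlags flags, TParseFlags flag)
{
    return flag != 0 && (flags & flag) == flag;
}

/* Hypothetical helper: one possible flag combination for image extraction. */
static TParseFlags gray_image_flags(void)
{
    return pfDecomprAllImages | pfConvImagesToGray;
}
```

With these definitions, gray_image_flags() evaluates to 0x00000012, and has_flag() reports pfConvImagesToGray as set and pfImageInfoOnly as not set.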
The function parses the content stream of the currently open page. The content parser can be used to
extract text, images, and vector graphics from a PDF file. The parameter Stack holds a set of callback
functions that are executed when the corresponding operators are found in the content stream. The
parameter Data is a user-defined pointer that is passed unchanged to the callback functions.
All callback functions are optional. Which callback functions must be set depends on the kind of
information that should be extracted. For example, an application that extracts images must at least
provide the callback function TInsertImage.
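The pattern described here (a structure of optional callbacks plus a user-defined pointer that is handed through unchanged) can be sketched as follows. This is an illustration under stated assumptions, not the DynaPDF API: DemoStack, the callback signatures, demo_parse(), and demo_run() are all hypothetical and heavily simplified, and demo_parse() only stands in for a parser that "finds" one text run and two images.

```c
#include <stddef.h>

/* Hypothetical, simplified callback types; the real signatures differ. */
typedef int (*DemoTextCallback)(void *data, const char *text);
typedef int (*DemoImageCallback)(void *data, int width, int height);

/* Hypothetical callback set. Members left NULL are simply not called. */
typedef struct {
    DemoTextCallback  ShowText;
    DemoImageCallback InsertImage;
    void             *Reserved[4];   /* unused members stay NULL */
} DemoStack;

/* User data that the image callback updates through the Data pointer. */
typedef struct {
    int  image_count;
    long total_pixels;
} ImageStats;

static int count_image(void *data, int width, int height)
{
    ImageStats *stats = (ImageStats *)data;
    stats->image_count++;
    stats->total_pixels += (long)width * height;
    return 0;                        /* zero: continue processing */
}

/* Hypothetical stand-in for the parser: one text run, then two images. */
static int demo_parse(const DemoStack *stack, void *data)
{
    int rc;
    if (stack->ShowText) {
        rc = stack->ShowText(data, "Hello");
        if (rc != 0) return rc;
    }
    if (stack->InsertImage) {
        rc = stack->InsertImage(data, 100, 50);
        if (rc != 0) return rc;
        rc = stack->InsertImage(data, 20, 10);
        if (rc != 0) return rc;
    }
    return 0;
}

/* Sets only the image callback; every other member remains NULL. */
static ImageStats demo_run(void)
{
    DemoStack  stack = {0};
    ImageStats stats = {0, 0};
    stack.InsertImage = count_image;
    demo_parse(&stack, &stats);
    return stats;
}
```

Zero-initializing the whole structure before assigning individual callbacks is a convenient way to keep unused and reserved members at NULL.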
All callback functions that return an integer value can break processing if necessary. A return
value of zero indicates success, and processing continues. A return value of 1 from the TBeginTemplate
or TBeginPattern callback functions indicates that the object should be skipped; the corresponding
content streams are not executed in this case. This can be useful when extracting images. Any other
return value breaks processing.
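The return value convention can be summarized as a small decision function. The sketch below only restates the rules from the paragraph above; the enum DemoAction and the function demo_action() are hypothetical names and do not exist in DynaPDF.

```c
/* Hypothetical names for the three possible reactions to a return value. */
typedef enum {
    ACT_CONTINUE,     /* zero: success, processing continues */
    ACT_SKIP_OBJECT,  /* 1 from TBeginTemplate or TBeginPattern: skip object */
    ACT_BREAK         /* any other value: processing stops */
} DemoAction;

/* is_template_or_pattern: nonzero when rc was returned by a
   TBeginTemplate or TBeginPattern callback. */
static DemoAction demo_action(int rc, int is_template_or_pattern)
{
    if (rc == 0)
        return ACT_CONTINUE;
    if (rc == 1 && is_template_or_pattern)
        return ACT_SKIP_OBJECT;
    return ACT_BREAK;
}
```

Note that under these rules a return value of 1 from any other callback falls into the last case and breaks processing.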
Notice:
Arbitrary objects may be written to the page while the content parser is running, but the
application must check whether a fatal error occurred when writing to the page.
The callback function must return a negative value in such a case to break processing. This is
required because the parser does not notice when a fatal error occurs. New objects are ignored
when the page is parsed.
ParseContent() has been part of DynaPDF since version 2.0.30, but it was never documented.
Because the function was undocumented, a few important changes were made in DynaPDF 2.5
which do not break backward compatibility. However, an application that uses the following
features must be changed slightly when it is recompiled:
 
