
DynaPDF Manual - Page 441


Function Reference
How to find and replace text in a page?
As mentioned in the previous sections, text replacement and text search algorithms are not easy to
develop because many things must be considered to get suitable results. To make development
easier, DynaPDF is delivered with several example projects, which are available in C++, Delphi,
VB .Net, and C#. Take a look at the examples text_extraction, text_extraction2, edit_text,
and text_search to see how the function GetPageText() can be used.
If you only need a text search algorithm, it is better to use the content parser of DynaPDF directly
because it is faster than GetPageText() (see the example text_search for further information).
The class CPDFEditText() (used in the example edit_text) already contains a rather complex and
complete text replacement algorithm that demonstrates how the functions ReplacePageText() and
ReplacePageTextEx() can be used. You should try to understand how this algorithm works so that
you can extend it. In particular, this class demonstrates how space characters can be identified and
how they must be handled when replacing text. Note, however, that PDF files are generally not
designed for editing existing content. Existing text should only be replaced if there is no other way
to achieve the same result or if only minor changes must be applied, e.g. correcting a misspelled word.
Remarks:
GetPageText() parses the content stream of the currently open page or template as it exists at the
time the function is executed. The content stream contains all operators and values that were added
beforehand with DynaPDF functions, including the contents of imported PDF files. If text should be
replaced or deleted, it is usually best to process imported pages before adding new content.
Return values:
If the function succeeds and further records are available, the return value is 1. If the function fails
or if no further records are available, the return value is 0.
If a content stream contains no text, the return value is 0 and the members TextLen and KerningCount
are set to 0. If a content stream contains only one text record, the return value is also 0, meaning
that no further records are available, but the members TextLen and KerningCount are set to values
greater than 0.
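Because a return value of 0 can mean "no text", "failure", or "this was the last record", a caller
has to inspect TextLen after every call, including the last one. The following is a minimal sketch
of such a loop, not DynaPDF code: TextRecord, FetchFn, MockPage, and mock_fetch are hypothetical
stand-ins that only mimic the return value rules described above, and the real record structure has
more members than the two named here. The sketch also assumes that a failed call leaves the
pre-zeroed members untouched.

```c
#include <stddef.h>

/* Hypothetical stand-in for the record filled by a GetPageText()-style call.
   Only the two members named in the text above are modelled. */
typedef struct {
    int TextLen;
    int KerningCount;
} TextRecord;

/* Hypothetical fetch function: fills *rec and returns 1 if further records
   are available, 0 otherwise (last record, no text, or failure). */
typedef int (*FetchFn)(void *ctx, TextRecord *rec);

/* Counts the text records delivered by fetch. The record is checked after
   every call, because the last valid record arrives with return value 0. */
static int count_text_records(FetchFn fetch, void *ctx)
{
    TextRecord rec;
    int count = 0;
    int more;
    do {
        rec.TextLen = 0;        /* assumption: a failed call leaves these at 0 */
        rec.KerningCount = 0;
        more = fetch(ctx, &rec);
        if (rec.TextLen > 0)
            count++;            /* a real caller would process the record here */
    } while (more);
    return count;
}

/* Mock data source used only to exercise the loop. */
typedef struct {
    const int *lens;  /* text length of each simulated record */
    int n;            /* number of simulated records */
    int pos;          /* index of the next record to deliver */
} MockPage;

static int mock_fetch(void *ctx, TextRecord *rec)
{
    MockPage *p = (MockPage *)ctx;
    if (p->pos >= p->n) {       /* no text at all: zero members, return 0 */
        rec->TextLen = 0;
        rec->KerningCount = 0;
        return 0;
    }
    rec->TextLen = p->lens[p->pos];
    rec->KerningCount = 1;
    p->pos++;
    return p->pos < p->n ? 1 : 0;  /* 1 only while further records remain */
}
```

A loop of the form "while (fetch(...)) process(...)" would silently drop the last record, or the
only record of a page, which is why the check on TextLen is placed inside a do/while loop.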
GetPageWidth
Syntax:
double pdfGetPageWidth(
const PPDF* IPDF) // Instance pointer
The function returns the width of the currently open page. If no open page can be detected, the return
value is the default width that will be used for newly created pages. The page width refers to the
media box of a page. The real size may be smaller if a crop box is present. The crop box takes
precedence because it crops the media box.
If SetUseVisibleCoords() was set to true, the function checks whether a crop box is present and
returns the size of this box if set. A PDF unit represents 1/72 inch. See also GetBBox().
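Since the returned width is expressed in PDF units, it usually has to be converted before it can be
compared with physical paper sizes. The sketch below shows the conversion based on the 1/72 inch
rule stated above, together with the crop box precedence rule. The helper names are hypothetical
and not part of the DynaPDF API. The box widths are passed in as plain numbers, so no library call
is involved.

```c
/* One PDF unit is 1/72 inch; one inch is 25.4 mm. */
#define PDF_UNITS_PER_INCH 72.0
#define MM_PER_INCH        25.4

/* Converts a length in PDF units to inches. */
static double pdf_units_to_inches(double units)
{
    return units / PDF_UNITS_PER_INCH;
}

/* Converts a length in PDF units to millimetres. */
static double pdf_units_to_mm(double units)
{
    return units / PDF_UNITS_PER_INCH * MM_PER_INCH;
}

/* Hypothetical helper modelling the rule described above: if a crop box
   is present, its width is the visible width; otherwise the media box
   width applies. */
static double visible_width(double media_width, int has_crop_box,
                            double crop_width)
{
    return has_crop_box ? crop_width : media_width;
}
```

For example, a media box width of 612 units corresponds to 8.5 inches, which is the width of a US
Letter page.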