DynaPDF Manual - Page 536

Function Reference
must be computed. The parameters must be available in the graphics state including the
corresponding callback functions which set the parameters.
The real text width measured in user space can be calculated as follows:
double x1 = 0.0;
double y1 = 0.0;
double x2 = Width; // Width is a parameter of the callback function
double y2 = 0.0;
// Transform the text matrix to user space
TCTM m = MulMatrix(m_GState.Matrix, *Matrix);
Transform(m, x1, y1); // Start point of the text record
Transform(m, x2, y2); // End point of the text record
double realTextWidth = CalcDistance(x1, y1, x2, y2);
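The helper functions used in the snippet above are not defined in this section. The following is a minimal sketch of how they might look, assuming the usual PDF matrix layout [a b c d x y] with row vectors. The struct Ctm and the wrapper RealTextWidth() are hypothetical stand-ins for illustration; they are not the library's own TCTM declaration or API.

```cpp
#include <cmath>

// Hypothetical stand-in for a PDF transformation matrix [a b c d x y].
// The field names are an assumption, not taken from this section.
struct Ctm {
    double a, b, c, d, x, y;
};

// Concatenates two matrices so that m2 is applied first, then m1.
Ctm MulMatrix(const Ctm& m1, const Ctm& m2) {
    Ctm r;
    r.a = m2.a * m1.a + m2.b * m1.c;
    r.b = m2.a * m1.b + m2.b * m1.d;
    r.c = m2.c * m1.a + m2.d * m1.c;
    r.d = m2.c * m1.b + m2.d * m1.d;
    r.x = m2.x * m1.a + m2.y * m1.c + m1.x;
    r.y = m2.x * m1.b + m2.y * m1.d + m1.y;
    return r;
}

// Transforms the point (x, y) in place.
void Transform(const Ctm& m, double& x, double& y) {
    const double tx = x; // keep the original x for the second equation
    x = tx * m.a + y * m.c + m.x;
    y = tx * m.b + y * m.d + m.y;
}

// Euclidean distance between two points.
double CalcDistance(double x1, double y1, double x2, double y2) {
    return std::hypot(x2 - x1, y2 - y1);
}

// Hypothetical wrapper around the steps shown above: maps the start and
// end point of a text record to user space and measures the distance.
double RealTextWidth(const Ctm& ctm, const Ctm& textMatrix, double width) {
    double x1 = 0.0, y1 = 0.0;
    double x2 = width, y2 = 0.0;
    const Ctm m = MulMatrix(ctm, textMatrix);
    Transform(m, x1, y1);
    Transform(m, x2, y2);
    return CalcDistance(x1, y1, x2, y2);
}
```

Measuring the distance between the two transformed points, instead of simply scaling the width, keeps the result correct when the text is rotated or skewed.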
The end point of a text record is usually required to determine whether the next record lies on the
same line. An algorithm that can construct text lines in arbitrarily rotated coordinate systems is
provided in the example Text Extraction, which is delivered with all DynaPDF versions.
Character Spacing
As described above, the current character spacing is already included in the text width that is
provided in all text callback functions. However, the value must be stored in the graphics state if the
width of a sub string must be computed. Character spacing is measured in unscaled font units. The
required transformation to text space is done in functions like GetTextWidth().
Word Spacing
Like character spacing, the current word spacing is already considered in the text width that is
provided in all text callback functions. However, word spacing applies to the space character of
simple fonts only.
An application that extracts text from PDF files may want to preserve the original formatting of
the text. In this case, the distance between two words in the same text record must be known, e.g. to
insert a number of spaces to emulate the word spacing.
However, note that the current word spacing must be ignored if the font type is ftType0 (the font
type is a parameter of the graphics state and is set with the TSetFont callback function).
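The rule above can be expressed as a small helper. This is only a sketch: the enum FontKind and the function EffectiveWordSpacing() are hypothetical names introduced for illustration, standing in for the font type stored in the application's own graphics state.

```cpp
// Hypothetical font classification kept in the application's graphics state.
// kType0Font stands for the ftType0 case described above.
enum FontKind { kSimpleFont, kType0Font };

// Returns the word spacing that should be applied to a space character:
// the stored value for simple fonts, zero for Type0 fonts.
double EffectiveWordSpacing(FontKind kind, double wordSpacing) {
    return (kind == kType0Font) ? 0.0 : wordSpacing;
}
```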
Another thing that must be considered is that word and character spacing are measured in unscaled
font units. The width of a space character including word spacing can be calculated with the function
GetTextWidth() that is part of the font API (the name is fntGetTextWidth() in C/C++).
An algorithm that considers word spacing must check whether the source string contains space
characters. If a space is found, the width of the sub string that occurs before it must be calculated so
that the start and end point of the word can be determined. Additional spaces can be skipped, and the
cursor position is then updated to the position behind the spaces. Processing continues until the entire
text of the record has been processed.
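The scan described above could be sketched as follows. This is an illustration under stated assumptions, not the algorithm of the Text Extraction example: the text is assumed to be a single-byte string in which the space character is ' ', and the callback measure is a hypothetical stand-in for a width function such as GetTextWidth() that returns the width of a sub string including character and word spacing. WordSpan and SplitRecord() are names invented for this sketch.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical result type: one word and its extent along the baseline
// of the text record, in the units returned by the width callback.
struct WordSpan {
    std::string text;
    double start; // offset of the first character from the record start
    double end;   // offset behind the last character of the word
};

// Splits a text record into words. The offsets are obtained by measuring
// the sub string that precedes the start and the end of each word.
std::vector<WordSpan> SplitRecord(
    const std::string& record,
    const std::function<double(const std::string&)>& measure)
{
    std::vector<WordSpan> words;
    const std::size_t len = record.size();
    std::size_t pos = 0;
    while (pos < len) {
        // Skip any run of spaces; the cursor ends up behind the spaces.
        while (pos < len && record[pos] == ' ') ++pos;
        if (pos >= len) break;
        // Find the end of the current word.
        std::size_t wordEnd = pos;
        while (wordEnd < len && record[wordEnd] != ' ') ++wordEnd;
        WordSpan w;
        w.text  = record.substr(pos, wordEnd - pos);
        w.start = measure(record.substr(0, pos));
        w.end   = measure(record.substr(0, wordEnd));
        words.push_back(w);
        pos = wordEnd;
    }
    return words;
}
```

The resulting offsets are distances along the baseline of the record. To obtain start and end points in user space, each offset can be transformed in the same way as the Width parameter in the real text width calculation. For a font of type ftType0, the width callback should leave word spacing out.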
 
