DynaPDF Manual - Page 556

Function Reference
Character Spacing
As described above, the current character spacing is already included in the text width that is
passed to all text callback functions. However, the value must be stored in the graphics state if the
width of a substring must be computed. Character spacing is measured in unscaled font units. The
required transformation to text space is done in functions like GetTextWidth().
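To illustrate why the character spacing has to be available when a substring is measured, the following sketch computes such a width by hand. It is not DynaPDF code: it assumes the spacing is added once per glyph after font size scaling, as in the general text positioning formula of the PDF specification, and in practice GetTextWidth() should be used instead. The names TextState and substringWidth and the glyph width table are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical subset of a graphics state; not a DynaPDF type.
struct TextState
{
   double fontSize;     // font size
   double charSpacing;  // character spacing (not scaled by the font size)
   double horizScale;   // horizontal scaling, 1.0 == 100%
};

// Width of the glyphs [first, first + count).
// glyphWidths holds advance widths in 1/1000 font units (assumption).
double substringWidth(const std::vector<double>& glyphWidths,
                      std::size_t first, std::size_t count,
                      const TextState& gs)
{
   assert(first + count <= glyphWidths.size());
   double w = 0.0;
   for (std::size_t i = first; i < first + count; ++i)
   {
      // The spacing is added once per glyph, after font size scaling.
      w += glyphWidths[i] / 1000.0 * gs.fontSize + gs.charSpacing;
   }
   // Horizontal scaling applies to glyph widths and spacing alike.
   return w * gs.horizScale;
}
```

Because the spacing is added per glyph, the width of a substring cannot be derived from the width of the whole record by simple proportion. This is the reason the value has to be kept in the graphics state.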
Word Spacing
Like character spacing, the current word spacing is already included in the text width that is
passed to all text callback functions. However, word spacing applies to the space character of
simple fonts only.
An application that extracts text from PDF files may want to preserve the original formatting of
the text. In this case, the distance between two words in the same text record must be known, e.g. to
insert a number of spaces to emulate the word spacing.
However, note that the current word spacing must be ignored if the font type is ftType0 (the font
type is a parameter of the graphics state and is set with the TSetFont callback function).
Another thing that must be considered is that word and character spacing are measured in unscaled
font units. The width of a space character including word spacing can be calculated with the function
GetTextWidth() that is part of the font API (the name is fntGetTextWidth() in C/C++).
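As a rough illustration of the two points above, the sketch below derives the advance of a space character and estimates how many spaces would reproduce a given gap. The enum FontKind, the struct SpaceMetrics and both functions are hypothetical helpers, not part of the DynaPDF API. In a real application the width would come from GetTextWidth() and the font type from the graphics state, and all values are assumed to be in the same unit already.

```cpp
#include <cassert>

// Hypothetical stand-in for the font type stored in the graphics state.
enum class FontKind { Simple, Type0 };

// Hypothetical inputs; both values are assumed to use the same unit.
struct SpaceMetrics
{
   double spaceWidth;   // width of the space glyph without word spacing
   double wordSpacing;  // current word spacing
};

// Advance of one space character. Word spacing is ignored for Type0 fonts.
double spaceAdvance(const SpaceMetrics& m, FontKind kind)
{
   return kind == FontKind::Type0 ? m.spaceWidth
                                  : m.spaceWidth + m.wordSpacing;
}

// Number of spaces that best approximates a gap of the given width.
// Returns at least 1 for a positive gap so that words are never merged.
int spacesForGap(double gap, double advance)
{
   assert(advance > 0.0);
   if (gap <= 0.0) return 0;
   int n = static_cast<int>(gap / advance + 0.5); // round to nearest
   return n < 1 ? 1 : n;
}
```

Rounding to the nearest count is only one possible policy. An application that reproduces column layouts may prefer a fixed reference width instead.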
An algorithm that considers word spacing must check whether the source string contains space
characters. If a space is found, the width of the substring that precedes it must be calculated so
that the start and end point of the word can be determined. Additional spaces can be skipped and the
cursor position is updated to the position behind the spaces. Processing continues until the entire
text of the record has been processed.
An algorithm that processes text in this way essentially calculates the start and end coordinates of
every text part that is separated by either spaces or kerning space.
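Independent of the callback interface, the splitting step described above can be sketched in isolation. The following is a minimal sketch, not the manual's implementation: WordSpan and splitWords are hypothetical names, and the per-character advances are assumed to be precomputed (for example with GetTextWidth()) and to include character and word spacing already.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Horizontal extent of one word, relative to the start of the record.
struct WordSpan
{
   std::size_t first;  // index of the first character
   std::size_t count;  // number of characters
   double      x1;     // start position
   double      x2;     // end position
};

// advances[i] is the full advance of text[i], spacing included (assumption).
std::vector<WordSpan> splitWords(const std::string& text,
                                 const std::vector<double>& advances)
{
   assert(text.size() == advances.size());
   std::vector<WordSpan> words;
   double x = 0.0;        // cursor position
   std::size_t i = 0;
   while (i < text.size())
   {
      // Skip spaces and move the cursor behind them.
      while (i < text.size() && text[i] == ' ') x += advances[i++];
      if (i == text.size()) break;
      // Collect the characters of the next word.
      WordSpan w = { i, 0, x, x };
      while (i < text.size() && text[i] != ' ') w.x2 += advances[i++];
      w.count = i - w.first;
      x = w.x2;
      words.push_back(w);
   }
   return words;
}
```

The resulting positions are offsets along the baseline of the record. They still have to be transformed with the text matrix and the current transformation matrix to obtain user space coordinates.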
The required source code looks as follows (C++):
// The following code fragment uses the TShowTextArrayW callback function.
SI32 parseShowTextArrayW(const void* Data, const struct TTextRecordA* Source,
                         struct TCTM* Matrix, const struct TTextRecordW* Kerning,
                         UI32 Count, double Width, LBOOL Decoded)
{
   if (!Decoded) return 0;
   double x1 = 0.0;
   double y1 = 0.0;
   // Transform the text matrix to user space
   TCTM m = MulMatrix(m_GState.Matrix, *Matrix);
   Transform(m, x1, y1); // Start point of the text record
   // Word spacing applies to simple fonts only.
   if (m_GState.FontType != ftType0)
   {
      UI32 i, j, last;
      double x2 = 0.0;
 
