Class TaggedPdfReaderTool

Namespace
iText.Kernel.Utils
Assembly
itext.kernel.dll

Converts a tagged PDF document into an XML file.

public class TaggedPdfReaderTool
Inheritance
TaggedPdfReaderTool

Constructors

TaggedPdfReaderTool(PdfDocument)

Constructs a TaggedPdfReaderTool for the given PdfDocument.

public TaggedPdfReaderTool(PdfDocument document)

Parameters

document PdfDocument

the document to read tag structure from

Fields

document

protected PdfDocument document

Field Value

PdfDocument

out

protected StreamWriter @out

Field Value

StreamWriter

parsedTags

protected IDictionary<PdfDictionary, IDictionary<int, string>> parsedTags

Field Value

IDictionary<PdfDictionary, IDictionary<int, string>>

rootTag

protected string rootTag

Field Value

string

Methods

ConvertToXml(Stream)

Converts the current tag structure into an XML file using the default encoding (UTF-8).

public virtual void ConvertToXml(Stream os)

Parameters

os Stream

the output stream to save XML file to
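A minimal usage sketch for this overload, using only the members documented on this page plus the standard PdfReader and PdfDocument constructors. The file names "tagged.pdf" and "structure.xml" and the root tag name "root" are placeholders chosen for illustration, and the input is assumed to be a tagged PDF.

```csharp
using System.IO;
using iText.Kernel.Pdf;
using iText.Kernel.Utils;

public static class TaggedPdfToXmlExample
{
    public static void Main()
    {
        // Placeholder paths: replace with a real tagged PDF and a writable output location.
        using (PdfDocument pdfDocument = new PdfDocument(new PdfReader("tagged.pdf")))
        using (Stream os = new FileStream("structure.xml", FileMode.Create))
        {
            TaggedPdfReaderTool tool = new TaggedPdfReaderTool(pdfDocument);

            // Optional: wrap the output in a single root element (name is illustrative).
            tool.SetRootTag("root");

            // Writes the tag structure as XML using the default encoding (UTF-8).
            tool.ConvertToXml(os);
        }
    }
}
```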

ConvertToXml(Stream, string)

Converts the current tag structure into an XML file using the provided encoding.

public virtual void ConvertToXml(Stream os, string charset)

Parameters

os Stream

the output stream to save XML file to

charset string

the charset of the resultant XML file
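The sketch below shows one way this overload might be used to capture the XML in memory instead of on disk. The helper name ReadStructureAsXml is hypothetical, and it is an assumption that the charset name "UTF-8" is accepted as written. The returned string may begin with a byte order mark, depending on how the underlying writer is configured.

```csharp
using System.IO;
using System.Text;
using iText.Kernel.Pdf;
using iText.Kernel.Utils;

public static class TaggedPdfToXmlStringExample
{
    // Hypothetical helper: returns the tag structure of a tagged PDF as an XML string.
    public static string ReadStructureAsXml(string pdfPath)
    {
        MemoryStream buffer = new MemoryStream();
        using (PdfDocument pdfDocument = new PdfDocument(new PdfReader(pdfPath)))
        {
            // Assumption: "UTF-8" is a charset name the tool accepts.
            new TaggedPdfReaderTool(pdfDocument).ConvertToXml(buffer, "UTF-8");
        }

        // MemoryStream.ToArray() can be called even if the stream has been closed.
        return Encoding.UTF8.GetString(buffer.ToArray());
    }
}
```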

EscapeXML(string, bool)

Escapes a string with the appropriate XML codes. NOTE: copied from the iText 5 XMLUtils class.

protected static string EscapeXML(string s, bool onlyASCII)

Parameters

s string

the string to be escaped

onlyASCII bool

if true, codes above 127 are always escaped as &#nn;

Returns

string

the escaped string

FixTagName(string)

Fixes the specified tag name so that it is a valid XML tag name.

protected static string FixTagName(string tag)

Parameters

tag string

the tag name to fix

Returns

string

the fixed tag name

InspectAttributes(PdfStructElem)

Inspects the attributes dictionary of a StructTreeRoot child.

protected virtual void InspectAttributes(PdfStructElem kid)

Parameters

kid PdfStructElem

the direct kid of the StructTreeRoot

InspectKid(IStructureNode)

Inspects a child of the StructTreeRoot.

protected virtual void InspectKid(IStructureNode kid)

Parameters

kid IStructureNode

the direct kid of the StructTreeRoot

InspectKids(IList<IStructureNode>)

Inspects the children of the StructTreeRoot.

protected virtual void InspectKids(IList<IStructureNode> kids)

Parameters

kids IList<IStructureNode>

list of the direct kids of the StructTreeRoot
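Because the inspection methods are protected and virtual, a subclass can hook into the traversal. The following is a sketch only: the class name CountingReaderTool and its VisitedNodes property are hypothetical, and how many times InspectKid is called for a given document depends on the tool's internal traversal, which this page does not specify.

```csharp
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Tagging;
using iText.Kernel.Utils;

// Hypothetical subclass that counts the structure nodes passed to InspectKid.
public class CountingReaderTool : TaggedPdfReaderTool
{
    public int VisitedNodes { get; private set; }

    public CountingReaderTool(PdfDocument document) : base(document)
    {
    }

    protected override void InspectKid(IStructureNode kid)
    {
        // Record the visit, then defer to the default behaviour so the XML output is unchanged.
        VisitedNodes++;
        base.InspectKid(kid);
    }
}
```

After calling ConvertToXml on an instance of such a subclass, VisitedNodes would hold the number of InspectKid calls made during that conversion.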

IsValidCharacterValue(int)

Checks if a character value should be escaped/unescaped.

public static bool IsValidCharacterValue(int c)

Parameters

c int

a character value

Returns

bool

true if it's OK to escape or unescape this value.
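Since this method is public and static, it can be called without constructing the tool. The sketch below prints the result for a few sample code points; the sample values are arbitrary choices for illustration, and no particular output is claimed here.

```csharp
using System;
using iText.Kernel.Utils;

public static class CharacterCheckExample
{
    public static void Main()
    {
        // Arbitrary sample values: NUL, TAB, 'A', a surrogate code unit, and an emoji code point.
        int[] samples = { 0x00, 0x09, 0x41, 0xD800, 0x1F600 };

        foreach (int c in samples)
        {
            bool valid = TaggedPdfReaderTool.IsValidCharacterValue(c);
            Console.WriteLine("U+{0:X4}: {1}", c, valid);
        }
    }
}
```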

ParseTag(PdfMcr)

Parses the tag of a marked content reference (MCR) kid of the StructTreeRoot.

protected virtual void ParseTag(PdfMcr kid)

Parameters

kid PdfMcr

the direct PdfMcr kid of the StructTreeRoot

SetRootTag(string)

Sets the name of the root tag of the resultant XML file.

public virtual TaggedPdfReaderTool SetRootTag(string rootTagName)

Parameters

rootTagName string

the name of the root tag

Returns

TaggedPdfReaderTool

this object