Class RtfParser

Namespace
iTextSharp.text.rtf.parser
Assembly
iTextSharp.LGPLv2.Core.dll

The RtfParser allows the importing of RTF documents or RTF document fragments. The RTF document or fragment is tokenised, its font and color definitions are corrected, and the result is added to the document being written.

Authors: Mark Hall (Mark.Hall@mail.room3b.eu), Howard Shank (hgshank@yahoo.com)
Since: 2.0.8

public class RtfParser
Inheritance
RtfParser
Inherited Members

Constructors

RtfParser(Document)

Constructor. Since 2.1.3.

public RtfParser(Document doc)

Parameters

doc Document

Fields

DESTINATION_NORMAL

Destination is normal. Text is processed.

public const int DESTINATION_NORMAL = 0

Field Value

int

DESTINATION_SKIP

Destination is skipping. Text is ignored.

public const int DESTINATION_SKIP = 1

Field Value

int

PARSER_ERROR

The parser is in an error state. This is the first of the error state values.

public const int PARSER_ERROR = -2147483648

Field Value

int

PARSER_ERROR_EOF

The parser reached the end of the file.

public const int PARSER_ERROR_EOF = -2147483647

Field Value

int

PARSER_IN_BLIPUID

Currently a blipuid control word is being parsed.

public const int PARSER_IN_BLIPUID = 536870931

Field Value

int

PARSER_IN_CHARSET

Currently the RTF charset is being parsed.

public const int PARSER_IN_CHARSET = 1

Field Value

int

PARSER_IN_COLOR_TABLE

Currently the RTF color table is being parsed.

public const int PARSER_IN_COLOR_TABLE = 6

Field Value

int

PARSER_IN_DEFFONT

Currently the RTF deffont is being parsed.

public const int PARSER_IN_DEFFONT = 2

Field Value

int

PARSER_IN_DOCUMENT

Currently the RTF document content is being parsed.

public const int PARSER_IN_DOCUMENT = 536870912

Field Value

int

PARSER_IN_FILE_TABLE

Currently the RTF filetbl is being parsed.

public const int PARSER_IN_FILE_TABLE = 5

Field Value

int

PARSER_IN_FONT_TABLE

Currently the RTF font table is being parsed.

public const int PARSER_IN_FONT_TABLE = 3

Field Value

int

PARSER_IN_FONT_TABLE_INFO

Currently a RTF font table info element is being parsed.

public const int PARSER_IN_FONT_TABLE_INFO = 4

Field Value

int

PARSER_IN_GENERATOR

Currently the RTF generator is being parsed.

public const int PARSER_IN_GENERATOR = 12

Field Value

int

PARSER_IN_HEADER

Currently the RTF document header is being parsed.

public const int PARSER_IN_HEADER = 0

Field Value

int

PARSER_IN_INFO_GROUP

Currently the RTF info group is being parsed. This is the first of the document state values.

public const int PARSER_IN_INFO_GROUP = 536870913

Field Value

int

PARSER_IN_LATENTSTYLES

Currently the latent style and formatting usage restrictions are being parsed.

public const int PARSER_IN_LATENTSTYLES = 21

Field Value

int

PARSER_IN_LISTOVERRIDE_TABLE

Currently the RTF listtable override is being parsed.

public const int PARSER_IN_LISTOVERRIDE_TABLE = 9

Field Value

int

PARSER_IN_LIST_TABLE

Currently the RTF list table is being parsed.

public const int PARSER_IN_LIST_TABLE = 8

Field Value

int

PARSER_IN_OLDCPROPS

Currently the RTF old character properties (oldcprops) are being parsed.

public const int PARSER_IN_OLDCPROPS = 15

Field Value

int

PARSER_IN_OLDPPROPS

Currently the RTF old paragraph properties (oldpprops) are being parsed.

public const int PARSER_IN_OLDPPROPS = 16

Field Value

int

PARSER_IN_OLDSPROPS

Currently the RTF old section properties (oldsprops) are being parsed.

public const int PARSER_IN_OLDSPROPS = 19

Field Value

int

PARSER_IN_OLDTPROPS

Currently the RTF old table properties (oldtprops) are being parsed.

public const int PARSER_IN_OLDTPROPS = 18

Field Value

int

PARSER_IN_PARAGRAPH_GROUP_PROPERTIES

public const int PARSER_IN_PARAGRAPH_GROUP_PROPERTIES = 22

Field Value

int

PARSER_IN_PARAGRAPH_TABLE

Currently the RTF paragraph group properties table (Word 2002) is being parsed.

public const int PARSER_IN_PARAGRAPH_TABLE = 14

Field Value

int

PARSER_IN_PICPROP

Currently a picprop control word is being parsed.

public const int PARSER_IN_PICPROP = 536870930

Field Value

int

PARSER_IN_PICT

Currently a pict control word is being parsed.

public const int PARSER_IN_PICT = 536870929

Field Value

int

PARSER_IN_PROT_USER_TABLE

Currently the RTF user protection information is being parsed.

public const int PARSER_IN_PROT_USER_TABLE = 20

Field Value

int

PARSER_IN_REV_TABLE

Currently the RTF revtbl is being parsed.

public const int PARSER_IN_REV_TABLE = 10

Field Value

int

PARSER_IN_RSID_TABLE

Currently the RTF rsidtable is being parsed.

public const int PARSER_IN_RSID_TABLE = 11

Field Value

int

PARSER_IN_SHPPICT

Currently a shppict control word is being parsed.

public const int PARSER_IN_SHPPICT = 536870928

Field Value

int

PARSER_IN_STYLESHEET

Currently the RTF stylesheet is being parsed. This is one of the header state values.

public const int PARSER_IN_STYLESHEET = 7

Field Value

int

PARSER_IN_UNKNOWN

Currently the parser is in an unknown state.

public const int PARSER_IN_UNKNOWN = -1879048193

Field Value

int

PARSER_IN_UPR

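Currently a upr destination is being parsed. In the source, this entry also carries the bit mapping used by the parser state values:

0111 1111 1111 1111 = Unknown state
0xxx xxxx xxxx xxxx = In Header
1xxx xxxx xxxx xxxx = In Document
2xxx xxxx xxxx xxxx = Reserved
4xxx xxxx xxxx xxxx = Other
8xxx xxxx xxxx xxxx = Errors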

public const int PARSER_IN_UPR = 536870914

Field Value

int
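
The bit mapping can be checked against the constant values listed on this page. The sketch below is illustrative only: ParserStateBits, MajorState, MinorValue and Describe are hypothetical helpers, not library members, and the constants are local copies of values from this page. It assumes the major state sits in the top 4 bits of the 32-bit value and the remaining 28 bits carry the value. Going by the listed values, the document states (for example PARSER_IN_DOCUMENT = 0x20000000) have a top nibble of 2.

```csharp
using System;

// Hypothetical helper, not part of the library. The constants are local
// copies of values listed on this page.
public static class ParserStateBits
{
    public const int PARSER_IN_HEADER = 0;
    public const int PARSER_IN_DOCUMENT = 536870912;   // 0x20000000
    public const int PARSER_IN_PICT = 536870929;       // 0x20000011
    public const int PARSER_STARTSTOP = 1073741825;    // 0x40000001
    public const int PARSER_ERROR = -2147483648;       // 0x80000000
    public const int PARSER_ERROR_EOF = -2147483647;   // 0x80000001

    // Assumption: the top 4 bits select the major state.
    public static int MajorState(int state) => (int)(unchecked((uint)state) >> 28);

    // Assumption: the low 28 bits carry the value within that major state.
    public static int MinorValue(int state) => state & 0x0FFFFFFF;

    // Formats a state as hex together with its two parts.
    public static string Describe(int state) =>
        $"0x{state:X8}: major={MajorState(state)}, value={MinorValue(state)}";
}
```

For example, ParserStateBits.MajorState(ParserStateBits.PARSER_ERROR) evaluates to 8, which matches the "Errors" row of the mapping.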

PARSER_STARTSTOP

First of the other state values.

public const int PARSER_STARTSTOP = 1073741825

Field Value

int

TOKENISER_BINARY

The RtfTokeniser is currently reading binary stream.

public const int TOKENISER_BINARY = 3

Field Value

int

TOKENISER_HEX

The RtfTokeniser is currently reading hex data.

public const int TOKENISER_HEX = 4

Field Value

int

TOKENISER_IGNORE_RESULT

The RtfTokeniser ignores the result.

public const int TOKENISER_IGNORE_RESULT = 5

Field Value

int

TOKENISER_NORMAL

The RtfTokeniser is in its ground state. Any token may follow.

public const int TOKENISER_NORMAL = 0

Field Value

int

TOKENISER_SKIP_BYTES

The RtfTokeniser is currently skipping bytes (see SetTokeniserSkipBytes). This is one of the tokeniser state values.

public const int TOKENISER_SKIP_BYTES = 1

Field Value

int

TOKENISER_SKIP_GROUP

The RtfTokeniser is currently skipping to the end of the group (see SetTokeniserStateSkipGroup).

public const int TOKENISER_SKIP_GROUP = 2

Field Value

int

TOKENISER_STATE_IN_ERROR

The RtfTokeniser is currently in an error state.

public const int TOKENISER_STATE_IN_ERROR = -2147483648

Field Value

int

TOKENISER_STATE_IN_UNKOWN

The RtfTokeniser is currently in an unknown state.

public const int TOKENISER_STATE_IN_UNKOWN = -16777216

Field Value

int

TYPE_CONVERT

Conversion type is a conversion. This uses the document (not rtfDoc) to add all the elements, so the output format depends on the writer used.

public const int TYPE_CONVERT = 2

Field Value

int

TYPE_IMPORT_FRAGMENT

Conversion type is an import of a partial file/fragment. Uses direct content to add everything.

public const int TYPE_IMPORT_FRAGMENT = 1

Field Value

int

TYPE_IMPORT_FULL

Conversion type is an import. Uses direct content to add everything. This is what the original import does.

public const int TYPE_IMPORT_FULL = 0

Field Value

int

TYPE_IMPORT_INTO_ELEMENT

Conversion type to import a document into an element, e.g. Chapter, Section, Table Cell, etc. Since 2.1.4.

public const int TYPE_IMPORT_INTO_ELEMENT = 3

Field Value

int

TYPE_UNIDENTIFIED

Conversion type is unknown

public const int TYPE_UNIDENTIFIED = -1

Field Value

int

errAssertion

public const int errAssertion = -6

Field Value

int

errBadTable

public const int errBadTable = -5

Field Value

int

errCtrlWordNotFound

public const int errCtrlWordNotFound = -8

Field Value

int

errEndOfFile

public const int errEndOfFile = -7

Field Value

int

errInvalidHex

public const int errInvalidHex = -4

Field Value

int

errOK

No error occurred. This is the first of the RTF parser error codes.

public const int errOK = 0

Field Value

int

errStackOverflow

public const int errStackOverflow = -2

Field Value

int

errStackUnderflow

public const int errStackUnderflow = -1

Field Value

int

errUnmatchedBrace

public const int errUnmatchedBrace = -3

Field Value

int
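
The Handle* methods further down return errOK or one of the other error codes above. When logging such return values, a name is easier to read than a number. The following is a minimal sketch of a lookup table; RtfErrorNames, NameOf and IsOk are hypothetical helpers, and the numeric values are copied from the fields listed above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the library. The numeric values mirror
// the err* constants listed on this page.
public static class RtfErrorNames
{
    private static readonly Dictionary<int, string> Names = new Dictionary<int, string>
    {
        { 0, "errOK" },
        { -1, "errStackUnderflow" },
        { -2, "errStackOverflow" },
        { -3, "errUnmatchedBrace" },
        { -4, "errInvalidHex" },
        { -5, "errBadTable" },
        { -6, "errAssertion" },
        { -7, "errEndOfFile" },
        { -8, "errCtrlWordNotFound" },
    };

    // Returns the constant name for a code, or a placeholder for unlisted values.
    public static string NameOf(int code) =>
        Names.TryGetValue(code, out var name) ? name : $"unknown ({code})";

    // True only for errOK (0).
    public static bool IsOk(int code) => code == 0;
}
```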

Methods

AddListener(IEventListener)

Adds an EventListener to the RtfCtrlWordMgr. The listener parameter is the new EventListener.

public void AddListener(IEventListener listener)

Parameters

listener IEventListener

ConvertRtfDocument(Stream, Document)

Converts an RTF document to an iText document. Usage: create a parser object and call this method with the input stream and the iText Document object. The readerIn parameter is the stream to read the RTF file from; doc is the iText document that the RTF file is to be added to. Throws IOException on I/O errors.

public void ConvertRtfDocument(Stream readerIn, Document doc)

Parameters

readerIn Stream
doc Document
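
A minimal sketch of the usage described above. It assumes the output is a PDF produced with PdfWriter from the iTextSharp.text.pdf namespace of the same library; RtfToPdfExample and its file-path parameters are hypothetical names introduced for illustration. Only the RtfParser constructor and the ConvertRtfDocument signature are taken from this page.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.text.rtf.parser;

// Hypothetical wrapper, not part of the library.
public static class RtfToPdfExample
{
    public static void Convert(string rtfPath, string pdfPath)
    {
        using (var rtfStream = File.OpenRead(rtfPath))
        using (var pdfStream = File.Create(pdfPath))
        {
            var doc = new Document();

            // Assumption: a PDF writer is attached; another writer would
            // produce a different output format from the same elements.
            PdfWriter.GetInstance(doc, pdfStream);
            doc.Open();

            // Create the parser and convert the RTF stream into the document.
            var parser = new RtfParser(doc);
            parser.ConvertRtfDocument(rtfStream, doc);

            doc.Close();
        }
    }
}
```

Because the conversion adds elements to the Document rather than writing RTF directly, the same call should work with whichever writer is attached to the document.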

GetConversionType()

Gets the conversion type. Returns the type of the conversion: import or convert.

public int GetConversionType()

Returns

int

GetCurrentDestination()

Get the current destination object.

public RtfDestination GetCurrentDestination()

Returns

RtfDestination

The current state destination

GetDestination(string)

Gets a destination from the map. The destination parameter is the string destination.

public static RtfDestination GetDestination(string destination)

Parameters

destination string

Returns

RtfDestination

The destination object from the map

GetDocument()

Get the Document object. Returns the object rtfDoc.

public Document GetDocument()

Returns

Document

GetExtendedDestination()

Helper method to indicate if this control word was a \* control word.

public bool GetExtendedDestination()

Returns

bool

true if it was a \* control word, otherwise false

GetImportManager()

Gets the import manager (RtfImportMgr) object.

public RtfImportMgr GetImportManager()

Returns

RtfImportMgr

GetLevel()

Gets the current group level. Returns the current group level value.

public int GetLevel()

Returns

int

GetLogFile()

Get the logfile name.

public string GetLogFile()

Returns

string

the logFile

GetParserState()

Gets the current state of the parser. Returns the current parser state value.

public int GetParserState()

Returns

int

GetRtfDocument()

Get the RTF Document object. Returns the object rtfDoc.

public RtfDocument GetRtfDocument()

Returns

RtfDocument

GetState()

Gets the state of the parser. Returns the current RtfParserState state object.

public RtfParserState GetState()

Returns

RtfParserState

GetTokeniserState()

Get the current state of the tokeniser.

public int GetTokeniserState()

Returns

int

The current state of the tokeniser.

HandleCharacter(int)

Handles text tokens. These are handed on to the appropriate destination handler. The nextChar parameter is the text token to handle.

public int HandleCharacter(int nextChar)

Parameters

nextChar int

Returns

int

errOK if ok, other if an error occurred.

HandleCloseGroup()

Handles close group tokens. (})

public int HandleCloseGroup()

Returns

int

errOK if ok, other if an error occurred.

HandleCtrlWord(RtfCtrlWordData)

Handles control word tokens. Depending on the current state, a control word can lead to a state change. When parsing the actual document contents, certain tabled values are remapped, e.g. colors, fonts, styles, etc.

public int HandleCtrlWord(RtfCtrlWordData ctrlWordData)

Parameters

ctrlWordData RtfCtrlWordData

The control word to handle.

Returns

int

errOK if ok, other if an error occurred.

HandleOpenGroup()

Handles open group tokens. ({)

public int HandleOpenGroup()

Returns

int

errOK if ok, other if an error occurred.
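
The four Handle* methods above correspond to the four token kinds a tokeniser can emit: open group, close group, control word and plain character. The following self-contained sketch shows that dispatch on a toy input. It is not the library's tokeniser: MiniRtfWalker and Walk are hypothetical, the event strings stand in for the handler calls, and the choice of which error code to return in each situation is an assumption (only the code values come from this page).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy walker, not the library's tokeniser.
public static class MiniRtfWalker
{
    // Values copied from the err* constants on this page.
    public const int ErrOK = 0;
    public const int ErrStackUnderflow = -1;
    public const int ErrUnmatchedBrace = -3;

    private static bool IsAsciiLetter(char c) =>
        (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');

    // Splits the input into the four token kinds and records one event per token.
    public static int Walk(string rtf, List<string> events)
    {
        int level = 0;
        int i = 0;
        while (i < rtf.Length)
        {
            char c = rtf[i];
            if (c == '{')
            {
                level++;                          // stands in for HandleOpenGroup
                events.Add("open:" + level);
                i++;
            }
            else if (c == '}')
            {
                if (level == 0) return ErrStackUnderflow;  // assumption: close without open
                events.Add("close:" + level);     // stands in for HandleCloseGroup
                level--;
                i++;
            }
            else if (c == '\\')
            {
                int start = ++i;
                while (i < rtf.Length && IsAsciiLetter(rtf[i])) i++;
                if (i == start)
                {
                    // Control symbol: a single non-letter character.
                    if (i < rtf.Length) i++;
                }
                else
                {
                    // Optional numeric parameter, possibly negative.
                    if (i + 1 < rtf.Length && rtf[i] == '-' && char.IsDigit(rtf[i + 1])) i++;
                    while (i < rtf.Length && char.IsDigit(rtf[i])) i++;
                }
                bool isWord = i - start > 0 && IsAsciiLetter(rtf[start]);
                events.Add("ctrl:" + rtf.Substring(start, i - start)); // stands in for HandleCtrlWord
                // A single space after a control word is a delimiter, not text.
                if (isWord && i < rtf.Length && rtf[i] == ' ') i++;
            }
            else
            {
                events.Add("char:" + c);          // stands in for HandleCharacter
                i++;
            }
        }
        return level == 0 ? ErrOK : ErrUnmatchedBrace;    // assumption: unclosed group
    }
}
```

Walking the input `{\rtf1 Hi}` records one open event, one control word event, two character events and one close event, and the group level returns to zero.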

ImportRtfDocument(Stream, RtfDocument)

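Imports a complete RTF document. The readerIn parameter is the stream to read the RTF document from; rtfDoc is the RtfDocument to add the imported document to.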

public void ImportRtfDocument(Stream readerIn, RtfDocument rtfDoc)

Parameters

readerIn Stream
rtfDoc RtfDocument

ImportRtfDocumentIntoElement(IElement, Stream, RtfDocument)

Imports a complete RTF document into an Element, e.g. Chapter, Section, Table Cell, etc. The readerIn parameter is the stream to read the RTF document from; rtfDoc is the RtfDocument to add the imported document to. Throws IOException on I/O errors. Since 2.1.4.

public void ImportRtfDocumentIntoElement(IElement elem, Stream readerIn, RtfDocument rtfDoc)

Parameters

elem IElement

The Element the document is to be imported into.

readerIn Stream
rtfDoc RtfDocument

ImportRtfFragment(Stream, RtfDocument, RtfImportMappings)

Imports an RTF fragment. The readerIn parameter is the stream to read the RTF fragment from; rtfDoc is the RTF document to add the fragment to; importMappings is the RtfImportMappings defining font and color mappings for the fragment. Throws IOException on I/O errors.

public void ImportRtfFragment(Stream readerIn, RtfDocument rtfDoc, RtfImportMappings importMappings)

Parameters

readerIn Stream
rtfDoc RtfDocument
importMappings RtfImportMappings

Init_stats()

Initialize the statistics values.

protected void Init_stats()

IsConvert()

Helper method to determine if the conversion type is TYPE_CONVERT. See also: TYPE_CONVERT.

public bool IsConvert()

Returns

bool

true if TYPE_CONVERT, otherwise false

IsImport()

Helper method to determine if the conversion type is TYPE_IMPORT_FULL or TYPE_IMPORT_FRAGMENT. See also: TYPE_IMPORT_FULL, TYPE_IMPORT_FRAGMENT.

public bool IsImport()

Returns

bool

true if TYPE_IMPORT_FULL or TYPE_IMPORT_FRAGMENT, otherwise false

IsImportFragment()

Helper method to determine if the conversion type is TYPE_IMPORT_FRAGMENT. See also: TYPE_IMPORT_FRAGMENT.

public bool IsImportFragment()

Returns

bool

true if TYPE_IMPORT_FRAGMENT, otherwise false

IsImportFull()

Helper method to determine if the conversion type is TYPE_IMPORT_FULL. See also: TYPE_IMPORT_FULL.

public bool IsImportFull()

Returns

bool

true if TYPE_IMPORT_FULL, otherwise false
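
The four Is* helpers above can be read as simple predicates over the value returned by GetConversionType(). The sketch below restates their summaries as code. ConversionTypes is a hypothetical class with local copies of the TYPE_* values from this page. How the library itself classifies TYPE_IMPORT_INTO_ELEMENT is not stated here, so the sketch follows the summaries literally and does not count it as an import.

```csharp
using System;

// Hypothetical restatement of the Is* summaries, not the library's code.
public static class ConversionTypes
{
    // Local copies of the TYPE_* values listed on this page.
    public const int TYPE_UNIDENTIFIED = -1;
    public const int TYPE_IMPORT_FULL = 0;
    public const int TYPE_IMPORT_FRAGMENT = 1;
    public const int TYPE_CONVERT = 2;
    public const int TYPE_IMPORT_INTO_ELEMENT = 3;

    public static bool IsConvert(int type) => type == TYPE_CONVERT;

    public static bool IsImportFull(int type) => type == TYPE_IMPORT_FULL;

    public static bool IsImportFragment(int type) => type == TYPE_IMPORT_FRAGMENT;

    // Per the summary: full import or fragment import.
    public static bool IsImport(int type) =>
        type == TYPE_IMPORT_FULL || type == TYPE_IMPORT_FRAGMENT;
}
```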

IsLogAppend()

public bool IsLogAppend()

Returns

bool

the logAppend

IsLogging()

Get flag indicating if logging is on or off.

public bool IsLogging()

Returns

bool

the logging

IsNewGroup()

Helper method to determine if this is a new group.

public bool IsNewGroup()

Returns

bool

true if this is a new group, otherwise it returns false.

OutputDebug(object, int, string)

public static void OutputDebug(object doc, int groupLevel, string str)

Parameters

doc object
groupLevel int
str string

RemoveListener(IEventListener)

Removes a previously added EventListener. This is one of the listener methods.

public void RemoveListener(IEventListener listener)

Parameters

listener IEventListener

SetCurrentDestination(string)

Set the current destination object for the current state.

public bool SetCurrentDestination(string destination)

Parameters

destination string

The destination value to set.

Returns

bool

SetExtendedDestination(bool)

Helper method to set the extended control word flag.

public bool SetExtendedDestination(bool value)

Parameters

value bool

Boolean to set the value to.

Returns

bool

isExtendedDestination.

SetLogAppend(bool)

public void SetLogAppend(bool logAppend)

Parameters

logAppend bool

the logAppend to set

SetLogFile(string)

Set the logFile name

public void SetLogFile(string logFile)

Parameters

logFile string

the logFile to set

SetLogFile(string, bool)

Set the logFile name

public void SetLogFile(string logFile, bool logAppend)

Parameters

logFile string

the logFile to set

logAppend bool

SetLogging(bool)

Set flag indicating if logging is on or off

public void SetLogging(bool logging)

Parameters

logging bool

true to turn on logging, false to turn off logging.

SetNewGroup(bool)

Helper method to set the new group flag

public bool SetNewGroup(bool value)

Parameters

value bool

The bool value to set the flag

Returns

bool

The value of newGroup

SetParserState(int)

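Sets the parser state. In the source, this entry also carries the section marker for the document control methods, which handle the four token kinds:

HandleOpenGroup: open groups, '{'
HandleCloseGroup: close groups, '}'
HandleCtrlWord: control words, '...'
HandleCharacter: characters, plain text, etc.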

public int SetParserState(int newState)

Parameters

newState int

Returns

int

SetTokeniserSkipBytes(long)

Sets the number of bytes to skip and the state of the tokeniser. The numberOfBytesToSkip parameter is the number of bytes to skip in the file.

public void SetTokeniserSkipBytes(long numberOfBytesToSkip)

Parameters

numberOfBytesToSkip long

SetTokeniserState(int)

Set the current state of the tokeniser.

public int SetTokeniserState(int value)

Parameters

value int

The new state of the tokeniser.

Returns

int

The state of the tokeniser.

SetTokeniserStateBinary(int)

Sets the number of binary bytes. The binaryCount parameter is the number of binary bytes.

public void SetTokeniserStateBinary(int binaryCount)

Parameters

binaryCount int

SetTokeniserStateBinary(long)

Sets the number of binary bytes. The binaryCount parameter is the number of binary bytes.

public void SetTokeniserStateBinary(long binaryCount)

Parameters

binaryCount long

SetTokeniserStateNormal()

Sets the tokeniser state to normal (TOKENISER_NORMAL).

public void SetTokeniserStateNormal()

SetTokeniserStateSkipGroup()

Set the tokeniser state to skip to the end of the group. Sets the state to TOKENISER_SKIP_GROUP and skipGroupLevel to the current group level.

public void SetTokeniserStateSkipGroup()
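
The skip-group behaviour described above can be illustrated with a small state machine. This is a sketch under assumptions, not the library's implementation: SkipGroupSketch and Filter are hypothetical, the '!' character is an invented trigger standing in for whatever causes the parser to skip a group, and the resume condition (return to normal when the closing brace of the recorded level is reached) is assumed rather than documented on this page.

```csharp
using System;
using System.Text;

// Hypothetical illustration of the skip-group state, not the library's code.
public sealed class SkipGroupSketch
{
    // Values copied from the TOKENISER_* constants on this page.
    public const int TOKENISER_NORMAL = 0;
    public const int TOKENISER_SKIP_GROUP = 2;

    public int State { get; private set; } = TOKENISER_NORMAL;
    public int Level { get; private set; }
    private int skipGroupLevel;

    // Records the current group level and switches to the skip state.
    public void SetTokeniserStateSkipGroup()
    {
        State = TOKENISER_SKIP_GROUP;
        skipGroupLevel = Level;
    }

    // Returns the characters that were not skipped.
    public string Filter(string input)
    {
        var kept = new StringBuilder();
        foreach (char c in input)
        {
            if (c == '{')
            {
                Level++;
            }
            else if (c == '}')
            {
                // Assumption: skipping ends when the recorded group closes.
                if (State == TOKENISER_SKIP_GROUP && Level == skipGroupLevel)
                    State = TOKENISER_NORMAL;
                Level--;
            }
            else if (State == TOKENISER_NORMAL)
            {
                if (c == '!') SetTokeniserStateSkipGroup(); // invented trigger
                else kept.Append(c);
            }
            // Any other character is dropped while in the skip state.
        }
        return kept.ToString();
    }
}
```

With the input `{a{!bc{d}}e}`, the trigger fires inside the second-level group, so that group and its nested child are dropped and only the characters a and e are kept.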

Tokenise()

Reads through the input file and parses the data stream into tokens. Throws IOException on I/O errors.

public void Tokenise()