Class RtfParser
- Namespace
- iTextSharp.text.rtf.parser
- Assembly
- iTextSharp.LGPLv2.Core.dll
The RtfParser imports RTF documents or RTF document fragments. The RTF document or fragment is tokenised, its font and color definitions are corrected, and the result is added to the document being written. Authors: Mark Hall (Mark.Hall@mail.room3b.eu), Howard Shank (hgshank@yahoo.com). Since 2.0.8.
public class RtfParser
- Inheritance
- object
RtfParser
- Inherited Members
Constructors
RtfParser(Document)
Constructor. Since 2.1.3.
public RtfParser(Document doc)
Parameters
doc
Document
Fields
DESTINATION_NORMAL
Destination is normal. Text is processed.
public const int DESTINATION_NORMAL = 0
Field Value
DESTINATION_SKIP
Destination is skipping. Text is ignored.
public const int DESTINATION_SKIP = 1
Field Value
PARSER_ERROR
The parser is in an error state. This is the base value of the error states.
public const int PARSER_ERROR = -2147483648
Field Value
PARSER_ERROR_EOF
The parser reached the end of the file.
public const int PARSER_ERROR_EOF = -2147483647
Field Value
PARSER_IN_BLIPUID
Currently a blipuid control word is being parsed.
public const int PARSER_IN_BLIPUID = 536870931
Field Value
PARSER_IN_CHARSET
Currently the RTF charset is being parsed.
public const int PARSER_IN_CHARSET = 1
Field Value
PARSER_IN_COLOR_TABLE
Currently the RTF color table is being parsed.
public const int PARSER_IN_COLOR_TABLE = 6
Field Value
PARSER_IN_DEFFONT
Currently the RTF deffont is being parsed.
public const int PARSER_IN_DEFFONT = 2
Field Value
PARSER_IN_DOCUMENT
Currently the RTF document content is being parsed.
public const int PARSER_IN_DOCUMENT = 536870912
Field Value
PARSER_IN_FILE_TABLE
Currently the RTF filetbl is being parsed.
public const int PARSER_IN_FILE_TABLE = 5
Field Value
PARSER_IN_FONT_TABLE
Currently the RTF font table is being parsed.
public const int PARSER_IN_FONT_TABLE = 3
Field Value
PARSER_IN_FONT_TABLE_INFO
Currently an RTF font table info element is being parsed.
public const int PARSER_IN_FONT_TABLE_INFO = 4
Field Value
PARSER_IN_GENERATOR
Currently the RTF generator is being parsed.
public const int PARSER_IN_GENERATOR = 12
Field Value
PARSER_IN_HEADER
Currently the RTF document header is being parsed.
public const int PARSER_IN_HEADER = 0
Field Value
PARSER_IN_INFO_GROUP
Currently the RTF info group is being parsed. This is one of the document state values.
public const int PARSER_IN_INFO_GROUP = 536870913
Field Value
PARSER_IN_LATENTSTYLES
Currently the RTF latent styles and formatting usage restrictions are being parsed.
public const int PARSER_IN_LATENTSTYLES = 21
Field Value
PARSER_IN_LISTOVERRIDE_TABLE
Currently the RTF list override table is being parsed.
public const int PARSER_IN_LISTOVERRIDE_TABLE = 9
Field Value
PARSER_IN_LIST_TABLE
Currently the RTF list table is being parsed.
public const int PARSER_IN_LIST_TABLE = 8
Field Value
PARSER_IN_OLDCPROPS
Currently the RTF old character properties (oldcprops) are being parsed.
public const int PARSER_IN_OLDCPROPS = 15
Field Value
PARSER_IN_OLDPPROPS
Currently the RTF old paragraph properties (oldpprops) are being parsed.
public const int PARSER_IN_OLDPPROPS = 16
Field Value
PARSER_IN_OLDSPROPS
Currently the RTF old section properties (oldsprops) are being parsed.
public const int PARSER_IN_OLDSPROPS = 19
Field Value
PARSER_IN_OLDTPROPS
Currently the RTF old table properties (oldtprops) are being parsed.
public const int PARSER_IN_OLDTPROPS = 18
Field Value
PARSER_IN_PARAGRAPH_GROUP_PROPERTIES
public const int PARSER_IN_PARAGRAPH_GROUP_PROPERTIES = 22
Field Value
PARSER_IN_PARAGRAPH_TABLE
Currently the RTF paragraph group properties table (Word 2002) is being parsed.
public const int PARSER_IN_PARAGRAPH_TABLE = 14
Field Value
PARSER_IN_PICPROP
Currently a picprop control word is being parsed.
public const int PARSER_IN_PICPROP = 536870930
Field Value
PARSER_IN_PICT
Currently a pict control word is being parsed.
public const int PARSER_IN_PICT = 536870929
Field Value
PARSER_IN_PROT_USER_TABLE
Currently the RTF user protection information is being parsed.
public const int PARSER_IN_PROT_USER_TABLE = 20
Field Value
PARSER_IN_REV_TABLE
Currently the RTF revtbl is being parsed.
public const int PARSER_IN_REV_TABLE = 10
Field Value
PARSER_IN_RSID_TABLE
Currently the RTF rsidtable is being parsed.
public const int PARSER_IN_RSID_TABLE = 11
Field Value
PARSER_IN_SHPPICT
Currently a shppict control word is being parsed.
public const int PARSER_IN_SHPPICT = 536870928
Field Value
PARSER_IN_STYLESHEET
Currently the RTF stylesheet is being parsed. This is one of the header state values.
public const int PARSER_IN_STYLESHEET = 7
Field Value
PARSER_IN_UNKNOWN
Currently the parser is in an unknown state.
public const int PARSER_IN_UNKNOWN = -1879048193
Field Value
PARSER_IN_UPR
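Currently a upr control word is being parsed. The parser state values are bit-mapped as follows:
- 0111 1111 1111 1111 = Unknown state
- 0xxx xxxx xxxx xxxx = In Header
- 1xxx xxxx xxxx xxxx = In Document
- 2xxx xxxx xxxx xxxx = Reserved
- 4xxx xxxx xxxx xxxx = Other
- 8xxx xxxx xxxx xxxx = Errors
The constant values listed on this page differ from this mapping in places: the document states start at 536870912 (0x20000000), and PARSER_IN_UNKNOWN is -1879048193 (0x8FFFFFFF).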
public const int PARSER_IN_UPR = 536870914
Field Value
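The parser state constants on this page fall into bands that can be told apart by their top four bits. The following is a minimal sketch of that grouping, using values copied from this page. `ParserStateBands` and `Band` are hypothetical names introduced for illustration and are not part of RtfParser.

```csharp
using System;

public static class ParserStateBands
{
    // Values copied from the constants documented on this page.
    public const int PARSER_IN_STYLESHEET = 7;           // 0x00000007
    public const int PARSER_IN_PICT = 536870929;         // 0x20000011
    public const int PARSER_STARTSTOP = 1073741825;      // 0x40000001
    public const int PARSER_ERROR_EOF = -2147483647;     // 0x80000001

    // Hypothetical helper: names the band a state value falls in,
    // based on the top four bits of the 32-bit value.
    public static string Band(int state)
    {
        // Reinterpret as unsigned so the sign bit of error states is not propagated by the shift.
        uint top = unchecked((uint)state) >> 28;
        switch (top)
        {
            case 0x0: return "header";
            case 0x2: return "document";
            case 0x4: return "other";
            case 0x8: return "error";
            default: return "unassigned";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Band(PARSER_IN_STYLESHEET)); // header
        Console.WriteLine(Band(PARSER_IN_PICT));       // document
        Console.WriteLine(Band(PARSER_STARTSTOP));     // other
        Console.WriteLine(Band(PARSER_ERROR_EOF));     // error
    }
}
```

With this grouping, PARSER_IN_UNKNOWN (-1879048193, 0x8FFFFFFF) lands in the error band, because its top four bits are 1000.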
PARSER_STARTSTOP
Start/stop state of the parser. This is one of the other state values.
public const int PARSER_STARTSTOP = 1073741825
Field Value
TOKENISER_BINARY
The RtfTokeniser is currently reading a binary stream.
public const int TOKENISER_BINARY = 3
Field Value
TOKENISER_HEX
The RtfTokeniser is currently reading hex data.
public const int TOKENISER_HEX = 4
Field Value
TOKENISER_IGNORE_RESULT
The RtfTokeniser ignores the result.
public const int TOKENISER_IGNORE_RESULT = 5
Field Value
TOKENISER_NORMAL
The RtfTokeniser is in its ground state. Any token may follow.
public const int TOKENISER_NORMAL = 0
Field Value
TOKENISER_SKIP_BYTES
The RtfTokeniser is currently skipping bytes.
public const int TOKENISER_SKIP_BYTES = 1
Field Value
TOKENISER_SKIP_GROUP
The RtfTokeniser is currently skipping to the end of the group.
public const int TOKENISER_SKIP_GROUP = 2
Field Value
TOKENISER_STATE_IN_ERROR
The RtfTokeniser is currently in an error state.
public const int TOKENISER_STATE_IN_ERROR = -2147483648
Field Value
TOKENISER_STATE_IN_UNKOWN
The RtfTokeniser is currently in an unknown state.
public const int TOKENISER_STATE_IN_UNKOWN = -16777216
Field Value
TYPE_CONVERT
Conversion type is a conversion. Elements are added through the document (not rtfDoc), so the resulting document format depends on the writer used.
public const int TYPE_CONVERT = 2
Field Value
TYPE_IMPORT_FRAGMENT
Conversion type is an import of a partial file/fragment. Uses direct content to add everything.
public const int TYPE_IMPORT_FRAGMENT = 1
Field Value
TYPE_IMPORT_FULL
Conversion type is an import. Uses direct content to add everything. This is what the original import does.
public const int TYPE_IMPORT_FULL = 0
Field Value
TYPE_IMPORT_INTO_ELEMENT
Conversion type is an import of a document into an element, e.g. Chapter, Section, Table Cell. Since 2.1.4.
public const int TYPE_IMPORT_INTO_ELEMENT = 3
Field Value
TYPE_UNIDENTIFIED
Conversion type is unknown
public const int TYPE_UNIDENTIFIED = -1
Field Value
errAssertion
public const int errAssertion = -6
Field Value
errBadTable
public const int errBadTable = -5
Field Value
errCtrlWordNotFound
public const int errCtrlWordNotFound = -8
Field Value
errEndOfFile
public const int errEndOfFile = -7
Field Value
errInvalidHex
public const int errInvalidHex = -4
Field Value
errOK
No error. This is the first of the RTF parser error codes.
public const int errOK = 0
Field Value
errStackOverflow
public const int errStackOverflow = -2
Field Value
errStackUnderflow
public const int errStackUnderflow = -1
Field Value
errUnmatchedBrace
public const int errUnmatchedBrace = -3
Field Value
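The handler methods on this page return one of the error codes above as a plain int. The following is a small sketch of a lookup that turns such a code back into its constant name for logging. `RtfErrorNames` and `Describe` are hypothetical names introduced for illustration, and the numeric values are copied from this page.

```csharp
using System;
using System.Collections.Generic;

public static class RtfErrorNames
{
    // Code-to-name table built from the error constants listed on this page.
    private static readonly Dictionary<int, string> Names = new Dictionary<int, string>
    {
        { 0, "errOK" },
        { -1, "errStackUnderflow" },
        { -2, "errStackOverflow" },
        { -3, "errUnmatchedBrace" },
        { -4, "errInvalidHex" },
        { -5, "errBadTable" },
        { -6, "errAssertion" },
        { -7, "errEndOfFile" },
        { -8, "errCtrlWordNotFound" },
    };

    // Hypothetical helper: returns the constant name, or a fallback for codes not in the table.
    public static string Describe(int code)
    {
        string name;
        return Names.TryGetValue(code, out name) ? name : "unknown (" + code + ")";
    }

    public static void Main()
    {
        Console.WriteLine(Describe(0));   // errOK
        Console.WriteLine(Describe(-3));  // errUnmatchedBrace
        Console.WriteLine(Describe(42));  // unknown (42)
    }
}
```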
Methods
AddListener(IEventListener)
Adds an EventListener to the RtfCtrlWordMgr.
public void AddListener(IEventListener listener)
Parameters
listener
IEventListener
The new EventListener.
ConvertRtfDocument(Stream, Document)
Converts an RTF document to an iText document. Usage: create a parser object and call this method with the input stream and the iText Document object. Throws IOException on I/O errors.
public void ConvertRtfDocument(Stream readerIn, Document doc)
Parameters
readerIn
Stream
The Reader to read the RTF file from.
doc
Document
The iText document that the RTF file is to be added to.
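The usage described for ConvertRtfDocument can be sketched as follows. This is an illustration under assumptions, not a prescribed recipe: it assumes a PdfWriter from iTextSharp.text.pdf is attached to the Document so that the converted content ends up in a PDF, and `RtfToPdf`, `Convert`, and the file paths are hypothetical names.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.text.rtf.parser;

public static class RtfToPdf
{
    // Hypothetical helper: converts the RTF file at rtfPath into a PDF at pdfPath.
    public static void Convert(string rtfPath, string pdfPath)
    {
        using (var input = File.OpenRead(rtfPath))
        using (var output = File.Create(pdfPath))
        {
            var document = new Document();

            // Assumption: a PDF writer listens to the document; the output format
            // depends on whichever writer is attached.
            PdfWriter.GetInstance(document, output);
            document.Open();

            // Create a parser and hand it the input stream and the open document.
            var parser = new RtfParser(document);
            parser.ConvertRtfDocument(input, document);

            document.Close();
        }
    }
}
```

A call such as `RtfToPdf.Convert("in.rtf", "out.pdf")` would then run the conversion. Since the method is documented as throwing IOException on I/O errors, callers may want to wrap the call in a try/catch.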
GetConversionType()
Get the conversion type.
public int GetConversionType()
Returns
- int
The type of the conversion: import or convert.
GetCurrentDestination()
Get the current destination object.
public RtfDestination GetCurrentDestination()
Returns
- RtfDestination
The current state destination
GetDestination(string)
Get a destination from the map.
public static RtfDestination GetDestination(string destination)
Parameters
destination
string
The string destination.
Returns
- RtfDestination
The destination object from the map
GetDocument()
Get the Document object.
public Document GetDocument()
Returns
- Document
The Document object.
GetExtendedDestination()
Helper method to indicate if this control word was a * control word.
public bool GetExtendedDestination()
Returns
- bool
true if it was a * control word, otherwise false
GetImportManager()
Get the RtfImportMgr object.
public RtfImportMgr GetImportManager()
Returns
- RtfImportMgr
The import manager object.
GetLevel()
Gets the current group level.
public int GetLevel()
Returns
- int
The current group level value.
GetLogFile()
Get the logfile name.
public string GetLogFile()
Returns
- string
the logFile
GetParserState()
Get the current state of the parser.
public int GetParserState()
Returns
- int
The current state of the parser.
GetRtfDocument()
Get the RTF Document object.
public RtfDocument GetRtfDocument()
Returns
- RtfDocument
The object rtfDoc.
GetState()
Get the state of the parser.
public RtfParserState GetState()
Returns
- RtfParserState
The current RtfParserState state object.
GetTokeniserState()
Get the current state of the tokeniser.
public int GetTokeniserState()
Returns
- int
The current state of the tokeniser.
HandleCharacter(int)
Handles text tokens. These are handed on to the appropriate destination handler.
public int HandleCharacter(int nextChar)
Parameters
nextChar
int
The text token to handle.
Returns
- int
errOK if ok, other if an error occurred.
HandleCloseGroup()
Handles close group tokens. (})
public int HandleCloseGroup()
Returns
- int
errOK if ok, other if an error occurred.
HandleCtrlWord(RtfCtrlWordData)
Handles control word tokens. Depending on the current state, a control word can lead to a state change. When parsing the actual document contents, certain tabled values are remapped, e.g. colors, fonts, and styles.
public int HandleCtrlWord(RtfCtrlWordData ctrlWordData)
Parameters
ctrlWordData
RtfCtrlWordData
The control word to handle.
Returns
- int
errOK if ok, other if an error occurred.
HandleOpenGroup()
Handles open group tokens. ({)
public int HandleOpenGroup()
Returns
- int
errOK if ok, other if an error occurred.
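The four Handle methods correspond to the four kinds of token in an RTF stream: open group, close group, control word, and character. The following is a simplified sketch of how a tokeniser loop might dispatch to handlers of this shape and stop at the first non-errOK result. `MiniRtfDispatcher` is a hypothetical stand-in, not the RtfParser implementation: it takes a string instead of RtfCtrlWordData, ignores control symbols and numeric signs, and its use of errStackUnderflow for an unmatched close brace is an assumption of the sketch.

```csharp
using System;
using System.Collections.Generic;

public class MiniRtfDispatcher
{
    public const int errOK = 0;
    public const int errStackUnderflow = -1;

    // Current group nesting depth and a record of the handler calls made.
    public int Level { get; private set; }
    public List<string> Events { get; } = new List<string>();

    public int HandleOpenGroup() { Level++; Events.Add("open"); return errOK; }

    public int HandleCloseGroup()
    {
        if (Level == 0) return errStackUnderflow; // sketch assumption: close without open
        Level--;
        Events.Add("close");
        return errOK;
    }

    public int HandleCtrlWord(string word) { Events.Add("ctrl:" + word); return errOK; }

    public int HandleCharacter(int nextChar) { Events.Add("char:" + (char)nextChar); return errOK; }

    // Walks the input once and dispatches each token to its handler.
    public int Tokenise(string rtf)
    {
        int i = 0;
        while (i < rtf.Length)
        {
            char c = rtf[i];
            int result;
            if (c == '{') { result = HandleOpenGroup(); i++; }
            else if (c == '}') { result = HandleCloseGroup(); i++; }
            else if (c == '\\')
            {
                int start = ++i;
                while (i < rtf.Length && char.IsLetterOrDigit(rtf[i])) i++;
                result = HandleCtrlWord(rtf.Substring(start, i - start));
                if (i < rtf.Length && rtf[i] == ' ') i++; // one space terminates the control word
            }
            else { result = HandleCharacter(c); i++; }

            if (result != errOK) return result;
        }
        return errOK;
    }

    public static void Main()
    {
        var d = new MiniRtfDispatcher();
        int rc = d.Tokenise(@"{\b Hi}");
        Console.WriteLine(rc + ": " + string.Join(", ", d.Events));
    }
}
```

For the input `{\b Hi}` the recorded events are open, ctrl:b, char:H, char:i, close, and the group level is back at 0 when the loop ends.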
ImportRtfDocument(Stream, RtfDocument)
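Imports a complete RTF document.
public void ImportRtfDocument(Stream readerIn, RtfDocument rtfDoc)
Parameters
readerIn
Stream
The Reader to read the RTF document from.
rtfDoc
RtfDocument
The RtfDocument to add the imported document to.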
ImportRtfDocumentIntoElement(IElement, Stream, RtfDocument)
Imports a complete RTF document into an Element, e.g. Chapter, Section, Table Cell. Throws IOException on I/O errors. Since 2.1.4.
public void ImportRtfDocumentIntoElement(IElement elem, Stream readerIn, RtfDocument rtfDoc)
Parameters
elem
IElement
The Element the document is to be imported into.
readerIn
Stream
The Reader to read the RTF document from.
rtfDoc
RtfDocument
The RtfDocument to add the imported document to.
ImportRtfFragment(Stream, RtfDocument, RtfImportMappings)
Imports an RTF fragment. Throws IOException on I/O errors.
public void ImportRtfFragment(Stream readerIn, RtfDocument rtfDoc, RtfImportMappings importMappings)
Parameters
readerIn
Stream
The Reader to read the RTF fragment from.
rtfDoc
RtfDocument
The RTF document to add the RTF fragment to.
importMappings
RtfImportMappings
The RtfImportMappings defining font and color mappings for the fragment.
Init_stats()
Initialize the statistics values.
protected void Init_stats()
IsConvert()
Helper method to determine if the conversion type is TYPE_CONVERT. See TYPE_CONVERT.
public bool IsConvert()
Returns
- bool
true if TYPE_CONVERT, otherwise false
IsImport()
Helper method to determine if the conversion type is TYPE_IMPORT_FULL or TYPE_IMPORT_FRAGMENT. See TYPE_IMPORT_FULL and TYPE_IMPORT_FRAGMENT.
public bool IsImport()
Returns
- bool
true if TYPE_IMPORT_FULL or TYPE_IMPORT_FRAGMENT, otherwise false
IsImportFragment()
Helper method to determine if the conversion type is TYPE_IMPORT_FRAGMENT. See TYPE_IMPORT_FRAGMENT.
public bool IsImportFragment()
Returns
- bool
true if TYPE_IMPORT_FRAGMENT, otherwise false
IsImportFull()
Helper method to determine if the conversion type is TYPE_IMPORT_FULL. See TYPE_IMPORT_FULL.
public bool IsImportFull()
Returns
- bool
true if TYPE_IMPORT_FULL, otherwise false
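Read against the TYPE_ constants, the four helper methods reduce to simple comparisons on the conversion type. The following sketch mirrors the method descriptions on this page rather than the library source. `ConversionTypeChecks` is a hypothetical name, and the predicates take the type as an argument instead of reading parser state.

```csharp
using System;

public static class ConversionTypeChecks
{
    // Values copied from the TYPE_ constants documented on this page.
    public const int TYPE_UNIDENTIFIED = -1;
    public const int TYPE_IMPORT_FULL = 0;
    public const int TYPE_IMPORT_FRAGMENT = 1;
    public const int TYPE_CONVERT = 2;
    public const int TYPE_IMPORT_INTO_ELEMENT = 3;

    // Each predicate follows the wording of the matching helper's description.
    public static bool IsConvert(int type) { return type == TYPE_CONVERT; }
    public static bool IsImportFull(int type) { return type == TYPE_IMPORT_FULL; }
    public static bool IsImportFragment(int type) { return type == TYPE_IMPORT_FRAGMENT; }
    public static bool IsImport(int type) { return IsImportFull(type) || IsImportFragment(type); }

    public static void Main()
    {
        Console.WriteLine(IsImport(TYPE_IMPORT_FRAGMENT));     // True
        Console.WriteLine(IsImport(TYPE_CONVERT));             // False
        Console.WriteLine(IsImport(TYPE_IMPORT_INTO_ELEMENT)); // False
    }
}
```

Under this reading, TYPE_IMPORT_INTO_ELEMENT and TYPE_UNIDENTIFIED make all four helpers return false, since none of the descriptions mention them.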
IsLogAppend()
public bool IsLogAppend()
Returns
- bool
the logAppend
IsLogging()
Get flag indicating if logging is on or off.
public bool IsLogging()
Returns
- bool
the logging
IsNewGroup()
Helper method to determine if this is a new group.
public bool IsNewGroup()
Returns
- bool
true if this is a new group, otherwise it returns false.
OutputDebug(object, int, string)
public static void OutputDebug(object doc, int groupLevel, string str)
Parameters
doc
object
groupLevel
int
str
string
RemoveListener(IEventListener)
Removes an EventListener from the RtfCtrlWordMgr.
public void RemoveListener(IEventListener listener)
Parameters
listener
IEventListener
SetCurrentDestination(string)
Set the current destination object for the current state.
public bool SetCurrentDestination(string destination)
Parameters
destination
string
The destination value to set.
Returns
- bool
SetExtendedDestination(bool)
Helper method to set the extended control word flag.
public bool SetExtendedDestination(bool value)
Parameters
value
bool
Boolean to set the value to.
Returns
- bool
isExtendedDestination.
SetLogAppend(bool)
public void SetLogAppend(bool logAppend)
Parameters
logAppend
bool
The logAppend to set.
SetLogFile(string)
Set the logFile name
public void SetLogFile(string logFile)
Parameters
logFile
string
The logFile to set.
SetLogFile(string, bool)
Set the logFile name and the logAppend flag.
public void SetLogFile(string logFile, bool logAppend)
Parameters
logFile
string
The logFile to set.
logAppend
bool
The logAppend to set.
SetLogging(bool)
Set flag indicating if logging is on or off
public void SetLogging(bool logging)
Parameters
logging
bool
true to turn on logging, false to turn off logging.
SetNewGroup(bool)
Helper method to set the new group flag
public bool SetNewGroup(bool value)
Parameters
value
bool
The bool value to set the flag to.
Returns
- bool
The value of newGroup
SetParserState(int)
Sets the current state of the parser. The document control methods handle the following tokens:
- HandleOpenGroup: open groups, '{'
- HandleCloseGroup: close groups, '}'
- HandleCtrlWord: control words, '\...'
- HandleCharacter: characters (plain text, etc.)
public int SetParserState(int newState)
Parameters
newState
int
Returns
- int
SetTokeniserSkipBytes(long)
Sets the number of bytes to skip and the state of the tokeniser.
public void SetTokeniserSkipBytes(long numberOfBytesToSkip)
Parameters
numberOfBytesToSkip
long
The number of bytes to skip in the file.
SetTokeniserState(int)
Set the current state of the tokeniser.
public int SetTokeniserState(int value)
Parameters
value
int
The new state of the tokeniser.
Returns
- int
The state of the tokeniser.
SetTokeniserStateBinary(int)
Sets the number of binary bytes.
public void SetTokeniserStateBinary(int binaryCount)
Parameters
binaryCount
int
The number of binary bytes.
SetTokeniserStateBinary(long)
Sets the number of binary bytes.
public void SetTokeniserStateBinary(long binaryCount)
Parameters
binaryCount
long
The number of binary bytes.
SetTokeniserStateNormal()
Set the tokeniser state to normal. Sets the state to TOKENISER_NORMAL.
public void SetTokeniserStateNormal()
SetTokeniserStateSkipGroup()
Set the tokeniser state to skip to the end of the group. Sets the state to TOKENISER_SKIP_GROUP and skipGroupLevel to the current group level.
public void SetTokeniserStateSkipGroup()
Tokenise()
Read through the input file and parse the data stream into tokens. Throws IOException on I/O error.
public void Tokenise()