Abstract utility class for parsing binary files.

Provides a library of methods to quickly navigate and extract various data types (signed and unsigned integers, floating- and fixed-point numbers, strings, etc.) from the file.

File access is managed via a Zend_Pdf_FileParserDataSource object. This allows the same parser code to work with many different data sources: in-memory objects, filesystem files, etc.
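The data source indirection described above can be pictured with a minimal sketch. The class `SketchStringSource` below is a hypothetical stand-in written for illustration, not the real Zend_Pdf_FileParserDataSource; it only assumes that a data source tracks a byte offset and hands out bytes on request.

```php
<?php
// Hypothetical in-memory data source (illustration only, not a Zend class).
// A parser written against readBytes()/moveToOffset() would not need to
// know whether the bytes come from a string, a file, or anything else.
final class SketchStringSource
{
    private string $data;
    private int $offset = 0;

    public function __construct(string $data)
    {
        $this->data = $data;
    }

    // Returns $byteCount bytes and advances the offset.
    public function readBytes(int $byteCount): string
    {
        if ($byteCount < 0 || $this->offset + $byteCount > strlen($this->data)) {
            throw new RuntimeException("Insufficient data to read $byteCount bytes");
        }
        $bytes = substr($this->data, $this->offset, $byteCount);
        $this->offset += $byteCount;
        return $bytes;
    }

    // Jumps to an absolute byte offset.
    public function moveToOffset(int $offset): void
    {
        if ($offset < 0 || $offset > strlen($this->data)) {
            throw new RuntimeException('Offset out of range');
        }
        $this->offset = $offset;
    }

    public function getOffset(): int
    {
        return $this->offset;
    }

    public function getSize(): int
    {
        return strlen($this->data);
    }
}

$source = new SketchStringSource("%PDF-1.4\n");
$magic = $source->readBytes(4);
```

A file-backed variant would expose the same methods, which is what lets one parser work across sources.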

package Zend_Pdf
subpackage FileParser
copyright Copyright (c) 2005-2015 Zend Technologies USA Inc. (http://www.zend.com)
license New BSD License

 Methods

Object constructor.

__construct(\Zend_Pdf_FileParserDataSource $dataSource) 

Verifies that the data source has been properly initialized.

Parameters

$dataSource

\Zend_Pdf_FileParserDataSource

Exceptions

\Zend_Pdf_Exception

Object destructor.

__destruct() 

Discards the data source object.

Returns the data source object representing the file being parsed.

getDataSource() : \Zend_Pdf_FileParserDataSource

Returns

\Zend_Pdf_FileParserDataSource

getOffset()

getOffset() 

getSize()

getSize() 

Returns true if the specified bit is set in the integer bitfield.

isBitSet(integer $bit, integer $bitField) : boolean

Parameters

$bit

integer

Bit number to test (i.e., 0-31).

$bitField

integer

Returns

boolean
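A bit test of this kind can be sketched as follows. The function `sketchIsBitSet()` is a hypothetical helper, and it assumes that bit 0 is the least significant bit, which the documentation above does not state explicitly.

```php
<?php
// Hypothetical helper mirroring the documented isBitSet() contract.
// Assumption: bit 0 is the least significant bit of the bitfield.
function sketchIsBitSet(int $bit, int $bitField): bool
{
    $mask = 1 << $bit;
    return ($bitField & $mask) === $mask;
}

$flags = 0b0101; // bits 0 and 2 are set
```

With this bitfield, bits 0 and 2 test as set and bit 1 does not.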

Returns true if the file has been successfully parsed.

isParsed() : boolean

Returns

boolean

Returns true if the file has passed a cursory validation check.

isScreened() : boolean

Returns

boolean

Convenience wrapper for the data source object's moveToOffset() method.

moveToOffset(integer $offset) 

Parameters

$offset

integer

Destination byte offset.

Exceptions

\Zend_Pdf_Exception

Reads and parses the complete binary file.

parse() 

Must set $this->_isParsed to true if successful.

Exceptions

\Zend_Pdf_Exception

Convenience wrapper for the data source object's readBytes() method.

readBytes(integer $byteCount) : string

Parameters

$byteCount

integer

Number of bytes to read.

Exceptions

\Zend_Pdf_Exception

Returns

string

Reads the signed fixed-point number from the binary file at the current byte offset.

readFixed(integer $mantissaBits, integer $fractionBits, integer $byteOrder = \Zend_Pdf_FileParser::BYTE_ORDER_BIG_ENDIAN) : float

Common fixed-point sizes are 2.14 and 16.16.

Advances the offset by the number of bytes read. Throws an exception if an error occurs.

Parameters

$mantissaBits

integer

Number of bits in the mantissa

$fractionBits

integer

Number of bits in the fraction

$byteOrder

integer

(optional) Big- or little-endian byte order. Use the BYTE_ORDER_ constants defined in Zend_Pdf_FileParser. If omitted, uses big-endian.

Exceptions

\Zend_Pdf_Exception

Returns

float
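The arithmetic behind a signed fixed-point read can be sketched as below: assemble the raw integer, reinterpret it as two's complement, then divide by 2 raised to the number of fraction bits. The helper `sketchDecodeFixed()` is hypothetical, works on an already-read big-endian byte string, and assumes a 64-bit PHP build; it is not the library's implementation.

```php
<?php
// Hypothetical helper: decode a big-endian signed fixed-point value.
// Assumes 64-bit PHP so that 32-bit raw values fit in a native integer.
function sketchDecodeFixed(string $bytes, int $mantissaBits, int $fractionBits): float
{
    $bitCount = $mantissaBits + $fractionBits;
    if ($bitCount <= 0 || $bitCount > 32 || $bitCount % 8 !== 0
        || strlen($bytes) * 8 !== $bitCount) {
        throw new InvalidArgumentException('Byte string does not match the fixed-point size');
    }

    // Assemble the raw unsigned value, most significant byte first.
    $raw = 0;
    foreach (str_split($bytes) as $byte) {
        $raw = ($raw << 8) | ord($byte);
    }

    // Reinterpret as two's complement when the sign bit is set.
    if ($raw & (1 << ($bitCount - 1))) {
        $raw -= 1 << $bitCount;
    }

    // Scale down by 2^fractionBits to place the binary point.
    return $raw / (1 << $fractionBits);
}

$one = sketchDecodeFixed("\x40\x00", 2, 14);            // 2.14 format
$oneAndHalf = sketchDecodeFixed("\x00\x01\x80\x00", 16, 16); // 16.16 format
```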

Reads the signed integer value from the binary file at the current byte offset.

readInt(integer $size, integer $byteOrder = \Zend_Pdf_FileParser::BYTE_ORDER_BIG_ENDIAN) : integer

Advances the offset by the number of bytes read. Throws an exception if an error occurs.

Parameters

$size

integer

Size of integer in bytes: 1-4

$byteOrder

integer

(optional) Big- or little-endian byte order. Use the BYTE_ORDER_ constants defined in Zend_Pdf_FileParser. If omitted, uses big-endian.

Exceptions

\Zend_Pdf_Exception

Returns

integer
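As a rough illustration of how a 1-4 byte signed integer could be decoded in either byte order, consider the sketch below. The constants and the function `sketchDecodeInt()` are hypothetical names; the constant values follow the BYTE_ORDER_ constants documented on this page, and a 64-bit PHP build is assumed.

```php
<?php
// Hypothetical constants; values mirror the documented BYTE_ORDER_ constants.
const SKETCH_LITTLE_ENDIAN = 0;
const SKETCH_BIG_ENDIAN = 1;

// Hypothetical helper: decode a signed integer from 1-4 raw bytes.
// Assumes 64-bit PHP so intermediate values never overflow.
function sketchDecodeInt(string $bytes, int $byteOrder = SKETCH_BIG_ENDIAN): int
{
    $size = strlen($bytes);
    if ($size < 1 || $size > 4) {
        throw new InvalidArgumentException('Size must be 1-4 bytes');
    }

    // Normalize to big-endian so a single loop handles both orders.
    if ($byteOrder === SKETCH_LITTLE_ENDIAN) {
        $bytes = strrev($bytes);
    }

    $value = 0;
    for ($i = 0; $i < $size; $i++) {
        $value = ($value << 8) | ord($bytes[$i]);
    }

    // Apply two's complement when the top bit is set.
    $signBit = 1 << ($size * 8 - 1);
    if ($value & $signBit) {
        $value -= $signBit * 2;
    }
    return $value;
}

$big = sketchDecodeInt("\xFF\xFE");
$little = sketchDecodeInt("\xFE\xFF", SKETCH_LITTLE_ENDIAN);
```

Both calls decode the same 16-bit value; only the order of the bytes in the input differs.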

Reads the Mac Roman-encoded string from the binary file at the current byte offset.

readStringMacRoman(integer $byteCount, string $characterSet = '') : string

You must supply the desired resulting character set.

Advances the offset by the number of bytes read. Throws an exception if an error occurs.

Parameters

$byteCount

integer

Number of bytes (characters) to return.

$characterSet

string

(optional) Desired resulting character set. You may use any character set supported by iconv(). If omitted, uses 'current locale'.

Exceptions

\Zend_Pdf_Exception

Returns

string

Reads the Pascal string from the binary file at the current byte offset.

readStringPascal(string $characterSet = '', integer $lengthBytes = 1) : string

The length of the Pascal string is determined by reading the length bytes which precede the character data. You must supply the desired resulting character set.

Advances the offset by the number of bytes read. Throws an exception if an error occurs.

Parameters

$characterSet

string

(optional) Desired resulting character set. You may use any character set supported by iconv(). If omitted, uses 'current locale'.

$lengthBytes

integer

(optional) Number of bytes that make up the length. Default is 1.

Exceptions

\Zend_Pdf_Exception

Returns

string
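The length-prefix layout of a Pascal string can be sketched as follows. The helper `sketchReadPascal()` is hypothetical; it assumes multi-byte lengths are stored big-endian and leaves out character set conversion entirely.

```php
<?php
// Hypothetical helper: extract a Pascal string starting at $offset.
// Returns [characterData, totalBytesConsumed].
// Assumption: a multi-byte length prefix is big-endian.
function sketchReadPascal(string $data, int $offset = 0, int $lengthBytes = 1): array
{
    if ($lengthBytes < 1 || $offset < 0 || $offset + $lengthBytes > strlen($data)) {
        throw new RuntimeException('Length prefix is out of range');
    }

    // Read the length prefix that precedes the character data.
    $length = 0;
    for ($i = 0; $i < $lengthBytes; $i++) {
        $length = ($length << 8) | ord($data[$offset + $i]);
    }

    $start = $offset + $lengthBytes;
    if ($start + $length > strlen($data)) {
        throw new RuntimeException('String data is truncated');
    }

    return [substr($data, $start, $length), $lengthBytes + $length];
}

[$text, $consumed] = sketchReadPascal("\x05Hello, world");
```

The consumed count includes the prefix, which matches the idea that the offset advances by every byte read.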

Reads the Unicode UTF-16-encoded string from the binary file at the current byte offset.

readStringUTF16(integer $byteCount, integer $byteOrder = \Zend_Pdf_FileParser::BYTE_ORDER_BIG_ENDIAN, string $characterSet = '') : string

The byte order of the UTF-16 string must be specified. You must also supply the desired resulting character set.

Advances the offset by the number of bytes read. Throws an exception if an error occurs.

todo Consider changing $byteCount to a character count. They are not always equivalent (in the case of surrogates).
todo Make $byteOrder optional if there is a byte-order mark (BOM) in the string being extracted.

Parameters

$byteCount

integer

Number of bytes (characters * 2) to return.

$byteOrder

integer

(optional) Big- or little-endian byte order. Use the BYTE_ORDER_ constants defined in Zend_Pdf_FileParser. If omitted, uses big-endian.

$characterSet

string

(optional) Desired resulting character set. You may use any character set supported by iconv(). If omitted, uses 'current locale'.

Exceptions

\Zend_Pdf_Exception

Returns

string
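The difference between byte count and character count mentioned in the todo notes can be seen with plain iconv() calls. This sketch assumes the iconv extension is available and that its backend recognizes the names 'UTF-16BE', 'UTF-16LE', and 'UTF-8'; it converts raw strings directly rather than calling the parser.

```php
<?php
// "Hi" in UTF-16: 4 bytes for 2 characters, in each byte order.
$fromBig = iconv('UTF-16BE', 'UTF-8', "\x00H\x00i");
$fromLittle = iconv('UTF-16LE', 'UTF-8', "H\x00i\x00");

// U+1D11E (musical symbol G clef) lies outside the Basic Multilingual
// Plane, so UTF-16 stores it as a surrogate pair: 4 bytes, 1 character.
$clefUtf16 = "\xD8\x34\xDD\x1E";
$clef = iconv('UTF-16BE', 'UTF-8', $clefUtf16);

$clefBytes = strlen($clefUtf16);
$clefCharacters = iconv_strlen($clef, 'UTF-8');
```

Here the "characters * 2" rule of thumb would predict two characters for the clef, while the string actually holds one.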

Reads the unsigned integer value from the binary file at the current byte offset.

readUInt(integer $size, integer $byteOrder = \Zend_Pdf_FileParser::BYTE_ORDER_BIG_ENDIAN) : integer

Advances the offset by the number of bytes read. Throws an exception if an error occurs.

NOTE: If you ask for a 4-byte unsigned integer on a 32-bit machine, the resulting value WILL BE SIGNED because PHP uses signed integers internally for everything. To guarantee portability, be sure to use bitwise operators on large unsigned integers!

Parameters

$size

integer

Size of integer in bytes: 1-4

$byteOrder

integer

(optional) Big- or little-endian byte order. Use the BYTE_ORDER_ constants defined in Zend_Pdf_FileParser. If omitted, uses big-endian.

Exceptions

\Zend_Pdf_Exception

Returns

integer
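The portability advice in the note above can be illustrated with PHP's built-in unpack(). In this sketch the 4-byte value is read with the 'N' format (unsigned 32-bit, big-endian); its numeric value differs between 32-bit and 64-bit builds, but the bitwise results do not. The variable names are illustrative only.

```php
<?php
// 0xFFFFFFFE read as a 4-byte big-endian integer. On 64-bit PHP this is
// 4294967294; on 32-bit PHP the same bits appear as a negative number.
$value = unpack('N', "\xFF\xFF\xFF\xFE")[1];

// Bitwise operators look at the bit pattern, so these results are the
// same regardless of how the platform interprets the sign.
$topBitSet = (($value >> 31) & 1) === 1;
$highWord = ($value >> 16) & 0xFFFF;
$lowWord = $value & 0xFFFF;
```

Comparisons such as `$value > 0` are the kind of arithmetic check that would behave differently across platforms.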

Performs a cursory check to verify that the binary file is in the expected format. Intended to quickly weed out obviously bogus files.

screen() 

Must set $this->_isScreened to true if successful.

Exceptions

\Zend_Pdf_Exception
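The contract that screen() and parse() set their respective flags on success might look like this in a concrete subclass. Everything here is hypothetical: `SketchParser` is a simplified stand-in for the abstract class, the "DEMO" file format is invented, and having parse() call screen() first is a design assumption, not documented behavior.

```php
<?php
// Hypothetical stand-in for the abstract parser; only the flags are modeled.
abstract class SketchParser
{
    protected bool $_isScreened = false;
    protected bool $_isParsed = false;
    protected string $data;

    public function __construct(string $data)
    {
        $this->data = $data;
    }

    public function isScreened(): bool
    {
        return $this->_isScreened;
    }

    public function isParsed(): bool
    {
        return $this->_isParsed;
    }

    abstract public function screen(): void;
    abstract public function parse(): void;
}

// Invented format: the 4-byte tag "DEMO" followed by a 1-byte version.
final class SketchDemoParser extends SketchParser
{
    public int $version = 0;

    // Cheap check that rejects obviously wrong files.
    public function screen(): void
    {
        if ($this->_isScreened) {
            return;
        }
        if (strlen($this->data) < 5 || substr($this->data, 0, 4) !== 'DEMO') {
            throw new RuntimeException('Not a DEMO file');
        }
        $this->_isScreened = true;
    }

    // Full parse; sets the flag only after all work has succeeded.
    public function parse(): void
    {
        if ($this->_isParsed) {
            return;
        }
        $this->screen();
        $this->version = ord($this->data[4]);
        $this->_isParsed = true;
    }
}

$parser = new SketchDemoParser("DEMO\x02");
$parser->parse();
```

Setting each flag as the last step means a thrown exception leaves the flag false, so isScreened() and isParsed() stay trustworthy.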

Convenience wrapper for the data source object's skipBytes() method.

skipBytes(integer $byteCount) 

Parameters

$byteCount

integer

Number of bytes to skip.

Exceptions

\Zend_Pdf_Exception

 Properties

 

Object representing the data source to be parsed.

$_dataSource : \Zend_Pdf_FileParserDataSource

Default

null
 

Flag indicating that the file has been successfully parsed.

$_isParsed : boolean

Default

false
 

Flag indicating that the file has passed a cursory validation check.

$_isScreened : boolean

Default

false

 Constants

 

Big-endian byte order (0x01 0x02 0x03 0x04).

BYTE_ORDER_BIG_ENDIAN = 1 
 

Little-endian byte order (0x04 0x03 0x02 0x01).

BYTE_ORDER_LITTLE_ENDIAN = 0
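The byte sequences in the two constant descriptions can be reproduced with PHP's built-in pack(), which is used here only to show the layouts: format 'N' writes a 32-bit value big-endian and 'V' writes it little-endian. The variable names are illustrative.

```php
<?php
// The same 32-bit value, 0x01020304, laid out in each byte order.
$bigEndian = pack('N', 0x01020304);    // most significant byte first
$littleEndian = pack('V', 0x01020304); // least significant byte first

// Reading the bytes back with the wrong order yields a different number.
$misread = unpack('V', $bigEndian)[1];
```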