category Zend
package Zend_Search_Lucene
subpackage Index
copyright Copyright (c) 2005-2015 Zend Technologies USA Inc. (http://www.zend.com)
license New BSD License

 Methods

Zend_Search_Lucene_Index_SegmentInfo constructor

__construct(\Zend_Search_Lucene_Storage_Directory $directory, string $name, integer $docCount, integer $delGen = 0, array|null $docStoreOptions = null, boolean $hasSingleNormFile = false, boolean $isCompound = null)

Parameters

$directory

\Zend_Search_Lucene_Storage_Directory

$name

string

$docCount

integer

$delGen

integer

$docStoreOptions

array | null

$hasSingleNormFile

boolean

$isCompound

boolean

Close terms stream

closeTermsStream() 

Should be used for resource cleanup if the stream is not read to the end.

inherited_from \Zend_Search_Lucene_Index_TermsStream_Interface::closeTermsStream()

Get compound file length

compoundFileLength(string $extension) : integer

Parameters

$extension

string

Returns

integer

Returns the total number of documents in this segment (including deleted documents).

count() : integer

Returns

integer

Returns the term at the current position

currentTerm() : \Zend_Search_Lucene_Index_Term | null
inherited_from \Zend_Search_Lucene_Index_TermsStream_Interface::currentTerm()

Returns

\Zend_Search_Lucene_Index_Term | null

Returns an array of all term positions in the documents.

currentTermPositions() : array

Return array structure: array( docId => array( pos1, pos2, ...), ...)

Returns

array
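The documented return structure can be walked with a nested loop. The following is a minimal sketch that uses hypothetical sample data in the documented shape rather than a real segment; the variable names are illustrative only.

```php
<?php
// Hypothetical sample data in the documented shape:
// array(docId => array(pos1, pos2, ...), ...)
$positions = array(
    3 => array(0, 7, 12),
    8 => array(4),
);

// Derive a per-document term frequency from each position list.
$freqs = array();
foreach ($positions as $docId => $docPositions) {
    $freqs[$docId] = count($docPositions);
}
```

The number of positions recorded for a document equals the number of occurrences of the term in that document, which is why the frequency can be derived from this structure.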

Deletes a document from the index segment.

delete($id) 

$id is an internal document id

Parameters

$id

Returns the current deletions file generation number.

getDelGen() : integer

Returns

integer

Returns field info for specified field

getField(integer $fieldNum) : \Zend_Search_Lucene_Index_FieldInfo

Parameters

$fieldNum

integer

Returns

\Zend_Search_Lucene_Index_FieldInfo

Returns array of FieldInfo objects.

getFieldInfos() : array

Returns

array

Returns field index or -1 if field is not found

getFieldNum(string $fieldName) : integer

Parameters

$fieldName

string

Returns

integer

Returns array of fields.

getFields(boolean $indexed = false) : array

If the $indexed parameter is true, only indexed fields are returned.

Parameters

$indexed

boolean

Returns

array

Return segment name

getName() : string

Returns

string

Scans terms dictionary and returns term info

getTermInfo(\Zend_Search_Lucene_Index_Term $term) : \Zend_Search_Lucene_Index_TermInfo

Parameters

$term

\Zend_Search_Lucene_Index_Term

Returns

\Zend_Search_Lucene_Index_TermInfo

Returns true if any documents have been deleted from this index segment.

hasDeletions() : boolean

Returns

boolean

Returns true if segment has single norms file.

hasSingleNormFile() : boolean

Returns

boolean

Returns true if segment is stored using compound segment file.

isCompound() : boolean

Returns

boolean

Checks whether the document is deleted

isDeleted($id) : boolean

Parameters

$id

Returns

boolean

Scans terms dictionary and returns next term

nextTerm() : \Zend_Search_Lucene_Index_Term | null
inherited_from \Zend_Search_Lucene_Index_TermsStream_Interface::nextTerm()

Returns

\Zend_Search_Lucene_Index_Term | null
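The terms stream methods (resetTermsStream(), currentTerm(), nextTerm(), closeTermsStream()) are typically used together. The sketch below shows one possible calling pattern against a hypothetical stand-in class, ArrayTermsStream, which only mimics the documented method names over a plain array. It assumes that the stream is positioned on the first term after a reset; it is not the library's implementation.

```php
<?php
// Hypothetical stand-in for a terms stream; mimics the documented
// method names only and is not the library's implementation.
class ArrayTermsStream
{
    private $terms;
    private $pos = 0;
    public $closed = false;

    public function __construct(array $terms)
    {
        $this->terms = array_values($terms);
    }

    public function resetTermsStream()
    {
        $this->pos = 0;
        $this->closed = false;
    }

    public function currentTerm()
    {
        return isset($this->terms[$this->pos]) ? $this->terms[$this->pos] : null;
    }

    public function nextTerm()
    {
        $this->pos++;
        return $this->currentTerm();
    }

    public function closeTermsStream()
    {
        $this->closed = true;
    }
}

// Collect terms until a stop value is seen, then close the stream early.
function collectUntil(ArrayTermsStream $stream, $stop)
{
    $seen = array();
    $stream->resetTermsStream();
    while (($term = $stream->currentTerm()) !== null && $term !== $stop) {
        $seen[] = $term;
        $stream->nextTerm();
    }
    // The stream was not read to the end, so release it explicitly.
    $stream->closeTermsStream();
    return $seen;
}

$stream = new ArrayTermsStream(array('apple', 'banana', 'cherry'));
$seen = collectUntil($stream, 'cherry');
```

Closing the stream after the loop covers the case described for closeTermsStream(), where reading stops before the last term.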

Returns the normalization factor for the specified document

norm(integer $id, string $fieldName) : float

Parameters

$id

integer

$fieldName

string

Returns

float

Returns norm vector, encoded in a byte string

normVector(string $fieldName) : string

Parameters

$fieldName

string

Returns

string

Returns the total number of non-deleted documents in this segment.

numDocs() : integer

Returns

integer
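The relationship between count(), numDocs() and the deleted documents can be sketched as a subtraction. This is a minimal illustration, assuming deletions are tracked as an array keyed by internal document id; liveDocCount() is a hypothetical helper, not a method of this class.

```php
<?php
// Hypothetical helper: live documents = total documents - deleted documents.
function liveDocCount($docCount, array $deleted)
{
    return $docCount - count($deleted);
}

// Ten documents in the segment, two of them marked as deleted.
$deleted = array(2 => 1, 5 => 1);
$live = liveDocCount(10, $deleted);
```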

Opens index file stored within compound index file

openCompoundFile(string $extension, boolean $shareHandler = true) : \Zend_Search_Lucene_Storage_File

Parameters

$extension

string

$shareHandler

boolean

Exceptions

\Zend_Search_Lucene_Exception

Returns

\Zend_Search_Lucene_Storage_File

Reset terms stream

resetTermsStream() : integer

$startId - ID of the first document; $compact - remove deleted documents

Returns start document id for the next segment

inherited_from \Zend_Search_Lucene_Index_TermsStream_Interface::resetTermsStream()

Exceptions

\Zend_Search_Lucene_Exception

Returns

integer

Skip terms stream up to the specified term prefix.

skipTo(\Zend_Search_Lucene_Index_Term $prefix) 

Prefix contains fully specified field info and portion of searched term

inherited_from \Zend_Search_Lucene_Index_TermsStream_Interface::skipTo()

Parameters

$prefix

\Zend_Search_Lucene_Index_Term

Exceptions

\Zend_Search_Lucene_Exception

Returns IDs of all the documents containing term.

termDocs(\Zend_Search_Lucene_Index_Term $term, integer $shift = 0, \Zend_Search_Lucene_Index_DocsFilter|null $docsFilter = null) : array

Parameters

$term

\Zend_Search_Lucene_Index_Term

$shift

integer

$docsFilter

\Zend_Search_Lucene_Index_DocsFilter | null

Returns

array

Returns term freqs array.

termFreqs(\Zend_Search_Lucene_Index_Term $term, integer $shift = 0, \Zend_Search_Lucene_Index_DocsFilter|null $docsFilter = null) : array

Result array structure: array(docId => freq, ...)

Parameters

$term

\Zend_Search_Lucene_Index_Term

$shift

integer

$docsFilter

\Zend_Search_Lucene_Index_DocsFilter | null

Returns

array

Returns term positions array.

termPositions(\Zend_Search_Lucene_Index_Term $term, integer $shift = 0, \Zend_Search_Lucene_Index_DocsFilter|null $docsFilter = null) : array

Result array structure: array(docId => array(pos1, pos2, ...), ...)

Parameters

$term

\Zend_Search_Lucene_Index_Term

$shift

integer

$docsFilter

\Zend_Search_Lucene_Index_DocsFilter | null

Returns

array

_cleanUpTermInfoCache()

_cleanUpTermInfoCache() 

Returns number of deleted documents.

_deletedCount() : integer

Returns

integer

Detect latest delete generation

_detectLatestDelGen() : integer

Actually used from the writeChanges() method, or from the constructor if it is invoked from the index writer. In both cases the index write lock is already obtained, so it does not need to be handled here.

Returns

integer

Get field position in a fields dictionary

_getFieldPosition(integer $fieldNum) : integer

Parameters

$fieldNum

integer

Returns

integer

Load 2.1+ format deletions file

_load21DelFile() : mixed

Returns bitset or an array depending on bitset extension availability

Returns

mixed

Load deletions file

_loadDelFile() : mixed

Returns bitset or an array depending on bitset extension availability

Exceptions

\Zend_Search_Lucene_Exception

Returns

mixed

Load terms dictionary index

_loadDictionaryIndex() 

Exceptions

\Zend_Search_Lucene_Exception

Load normalization factors from an index file

_loadNorm(integer $fieldNum) 

Parameters

$fieldNum

integer

Exceptions

\Zend_Search_Lucene_Exception

Load pre-2.1 deletions file

_loadPre21DelFile() : mixed

Returns bitset or an array depending on bitset extension availability

Exceptions

\Zend_Search_Lucene_Exception

Returns

mixed

 Properties

 

Delete file generation number

$_delGen : integer

Default

-2 means autodetect latest delete generation
-1 means 'there is no delete file'
0 means pre-2.1 format delete file
X specifies used delete file
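The documented $_delGen values can be restated as a small lookup. The function below is a hypothetical helper written only to illustrate the four cases; it is not part of the class.

```php
<?php
// Hypothetical helper that restates the documented $_delGen values.
function describeDelGen($delGen)
{
    if ($delGen === -2) {
        return 'autodetect latest delete generation';
    }
    if ($delGen === -1) {
        return 'no delete file';
    }
    if ($delGen === 0) {
        return 'pre-2.1 format delete file';
    }
    if (is_int($delGen) && $delGen > 0) {
        // Any positive value names the delete file generation in use.
        return 'delete file generation ' . $delGen;
    }
    throw new InvalidArgumentException('Unsupported delete generation value');
}

$meaning = describeDelGen(0);
```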

 

List of deleted documents.

$_deleted : mixed

Default

null

bitset if bitset extension is loaded or array otherwise.

 

$this->_deleted update flag

$_deletedDirty : boolean

Default

false
 

File system adapter.

$_directory : \Zend_Search_Lucene_Storage_Directory_Filesystem

Default

 

Number of docs in a segment

$_docCount : integer

Default

 

Map of the document IDs. Used to get the new docID after removing deleted documents.

$_docMap : array | null

Default

null

It is not very memory-efficient, but it is much faster than other methods.
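A document ID map of this kind can be pictured as a precomputed array from old IDs to compacted IDs. The sketch below is an assumption about the general technique, not the class's actual code; buildDocMap() and its parameters are hypothetical.

```php
<?php
// Hypothetical sketch: map old document IDs to compacted IDs,
// skipping documents that are marked as deleted.
function buildDocMap($docCount, array $deleted, $startId = 0)
{
    $map = array();
    $next = $startId;
    for ($id = 0; $id < $docCount; $id++) {
        if (isset($deleted[$id])) {
            continue; // deleted documents get no new ID
        }
        $map[$id] = $next++;
    }
    return $map;
}

// Five documents, with documents 1 and 3 deleted.
$map = buildDocMap(5, array(1 => true, 3 => true));
```

Storing one entry per live document costs memory, but each later lookup is a single array access, which matches the trade-off described above.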

 

Segment fields. Array of Zend_Search_Lucene_Index_FieldInfo objects for this segment

$_fields : array

Default

 

Field positions in a dictionary.

$_fieldsDicPositions : array

Default

(Term dictionary contains fields ordered by names)

 

Frequencies File object for stream like terms reading

$_frqFile : \Zend_Search_Lucene_Storage_File

Default

null
 

Actual offset of the .frq file data

$_frqFileOffset : integer

Default

 

Segment has single norms file

$_hasSingleNormFile : boolean

Default

If true, one .nrm file is used for all fields. Otherwise .fN files are used.

 

Segment index interval

$_indexInterval : integer

Default

 

Use compound segment file (*.cfs) to collect all other segment files (excluding .del files)

$_isCompound : boolean

Default

 

Last Term in a terms stream

$_lastTerm : \Zend_Search_Lucene_Index_Term

Default

null
 

Last TermInfo in a terms stream

$_lastTermInfo : \Zend_Search_Lucene_Index_TermInfo

Default

null
 

An array of all term positions in the documents.

$_lastTermPositions : array | null

Default

Array structure: array( docId => array( pos1, pos2, ...), ...)

Is set to null if term positions loading has to be skipped

 

Segment name

$_name : string

Default

 

Normalization factors.

$_norms : array

Default

array()

An array fieldName => normVector. normVector is a binary string. Each byte corresponds to an indexed document in a segment and encodes a normalization factor (float value, encoded by Zend_Search_Lucene_Search_Similarity::encodeNorm()).
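Because each byte of a norm vector corresponds to one document, the encoded norm for a document can be read by indexing the string. The following sketch uses a hypothetical four-byte vector and a hypothetical helper, encodedNorm(); converting the byte back to a float is left to the library and is not reproduced here.

```php
<?php
// Hypothetical norm vector for a 4-document segment: one byte per document.
$normVector = "\x7C\x78\x00\x80";

// Read the encoded norm byte for one internal document id.
function encodedNorm($normVector, $id)
{
    if ($id < 0 || $id >= strlen($normVector)) {
        throw new OutOfRangeException('No norm byte for document ' . $id);
    }
    return ord($normVector[$id]);
}

$byte = encodedNorm($normVector, 1);
```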

 

Positions File object for stream like terms reading

$_prxFile : \Zend_Search_Lucene_Storage_File

Default

null
 

Actual offset of the .prx file in the compound file

$_prxFileOffset : integer

Default

 

Associative array where the key is the file name and the value is the file size (.cfs).

$_segFileSizes : array

Default

 

Associative array where the key is the file name and the value is the data offset in a compound segment file (.cfs).

$_segFiles : array

Default

 

$_sharedDocStoreOptions

$_sharedDocStoreOptions 

Default

 

Segment skip interval

$_skipInterval : integer

Default

 

Actual number of terms in term stream

$_termCount : integer

Default

0
 

Term Dictionary Index

$_termDictionary : array

Default

Array of arrays (Zend_Search_Lucene_Index_Term objects are represented as arrays because of performance considerations):
[0] -> $termValue
[1] -> $termFieldNum

The corresponding Zend_Search_Lucene_Index_TermInfo object is stored in $_termDictionaryInfos.

 

Term Dictionary Index TermInfos

$_termDictionaryInfos : array

Default

Array of arrays (Zend_Search_Lucene_Index_TermInfo objects are represented as arrays because of performance considerations):
[0] -> $docFreq
[1] -> $freqPointer
[2] -> $proxPointer
[3] -> $skipOffset
[4] -> $indexPointer
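The two parallel arrays, $_termDictionary and $_termDictionaryInfos, can be read together by position. The sketch below uses hypothetical sample entries in the documented layouts and a hypothetical helper, docFreqFor(); the linear scan is for illustration only and says nothing about how the class searches its dictionary.

```php
<?php
// Hypothetical entries in the documented positional-array layouts.
$termDictionary = array(
    array('apple', 0),   // [0] term value, [1] term field number
    array('banana', 0),
);
$termDictionaryInfos = array(
    // docFreq, freqPointer, proxPointer, skipOffset, indexPointer
    array(3, 0, 0, 0, 0),
    array(1, 24, 40, 0, 128),
);

// Look up docFreq for a term; entry $i in one array matches entry $i in the other.
function docFreqFor(array $dict, array $infos, $value, $fieldNum)
{
    foreach ($dict as $i => $entry) {
        if ($entry[0] === $value && $entry[1] === $fieldNum) {
            return $infos[$i][0];
        }
    }
    return null;
}

$df = docFreqFor($termDictionary, $termDictionaryInfos, 'banana', 0);
```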

 

TermInfo cache

$_termInfoCache : array

Default

array()

Size is 1024. Numbers are used instead of class constants because of performance considerations

 

Overall number of terms in term stream

$_termNum : integer

Default

0
 

Terms scan mode

$_termsScanMode : integer

Default

Values:

self::SM_TERMS_ONLY - terms are scanned, no additional info is retrieved
self::SM_FULL_INFO - terms are scanned, frequency and position info is retrieved
self::SM_MERGE_INFO - terms are scanned, frequency and position info is retrieved, document numbers are compacted (shifted if segment has deleted documents)

 

Term Dictionary File object for stream like terms reading

$_tisFile : \Zend_Search_Lucene_Storage_File

Default

null
 

Actual offset of the .tis file data

$_tisFileOffset : integer

Default

 

True if segment uses shared doc store

$_usesSharedDocStore : boolean

Default

 Constants

 

"Full scan vs fetch" boundary.

FULL_SCAN_VS_FETCH_BOUNDARY = 5 

If filter selectivity is less than this value, then full scan is performed (since term entries fetching has some additional overhead).

 

SM_FULL_INFO

SM_FULL_INFO = 1 
 

SM_MERGE_INFO

SM_MERGE_INFO = 2 
 

Scan modes

SM_TERMS_ONLY = 0