An Analyzer is used to analyze text.
It thus represents a policy for extracting index terms from text.
Note: The Lucene Java implementation is stream-oriented, which lets it work efficiently with very large documents (more than 20 MB). The engine itself, however, is not oriented toward such documents. The Zend_Search_Lucene analysis API therefore works with data strings and sets (arrays).
Tag | Value |
---|---|
category | Zend |
package | Zend_Search_Lucene |
subpackage | Analysis |
copyright | Copyright (c) 2005-2015 Zend Technologies USA Inc. (http://www.zend.com) |
license | New BSD License |
getDefault() : \Zend_Search_Lucene_Analysis_Analyzer
Returns: \Zend_Search_Lucene_Analysis_Analyzer
nextToken() : \Zend_Search_Lucene_Analysis_Token | null
Tokens are returned in UTF-8 (the internal Zend_Search_Lucene encoding).
Returns: \Zend_Search_Lucene_Analysis_Token | null
reset()
setDefault(\Zend_Search_Lucene_Analysis_Analyzer $analyzer)
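The getDefault() / setDefault() pair, together with the $_defaultImpl property, suggests a single shared default analyzer. The following is a minimal, self-contained sketch of that pattern, not the library source. The class names SketchAnalyzerBase, SketchLowercaseAnalyzer and SketchCaseSensitiveAnalyzer are hypothetical, and two behaviours are assumptions: that both methods are called statically, and that getDefault() falls back to a built-in analyzer when none has been set.

```php
<?php
// Hypothetical stand-in for the analyzer base class; not part of Zend_Search_Lucene.
abstract class SketchAnalyzerBase
{
    // Mirrors the documented $_defaultImpl property (assumed to be static).
    private static $_defaultImpl = null;

    // Replace the shared default analyzer.
    public static function setDefault(SketchAnalyzerBase $analyzer)
    {
        self::$_defaultImpl = $analyzer;
    }

    // Return the shared default analyzer.
    // Assumption: create a fallback analyzer lazily if none was set.
    public static function getDefault()
    {
        if (!self::$_defaultImpl instanceof SketchAnalyzerBase) {
            self::$_defaultImpl = new SketchLowercaseAnalyzer();
        }
        return self::$_defaultImpl;
    }

    abstract public function tokenize($data, $encoding = '');
}

// Hypothetical analyzer: splits on whitespace and lower-cases ASCII letters.
class SketchLowercaseAnalyzer extends SketchAnalyzerBase
{
    public function tokenize($data, $encoding = '')
    {
        return preg_split('/\s+/', strtolower(trim($data)), -1, PREG_SPLIT_NO_EMPTY);
    }
}

// Hypothetical analyzer: splits on whitespace and keeps the original case.
class SketchCaseSensitiveAnalyzer extends SketchAnalyzerBase
{
    public function tokenize($data, $encoding = '')
    {
        return preg_split('/\s+/', trim($data), -1, PREG_SPLIT_NO_EMPTY);
    }
}

// Terms produced by the fallback default.
$before = SketchAnalyzerBase::getDefault()->tokenize('Zend Search');

// Swap the default; every later getDefault() call sees the new analyzer.
SketchAnalyzerBase::setDefault(new SketchCaseSensitiveAnalyzer());
$after = SketchAnalyzerBase::getDefault()->tokenize('Zend Search');
```

Because the default is shared, code that indexes documents and code that parses queries would pick up the same term-extraction policy, provided both obtain their analyzer through getDefault().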
setInput(string $data, $encoding = '')
Parameters: $data : string
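The setInput(), reset() and nextToken() methods read as a small pull-style stream API. The sketch below illustrates how such an API might be consumed. SketchStreamAnalyzer is a hypothetical class, tokens are plain arrays rather than \Zend_Search_Lucene_Analysis_Token objects, and the rule that setInput() rewinds the stream is an assumption, as is the private $_position cursor.

```php
<?php
// Hypothetical stand-in for a concrete analyzer; not part of Zend_Search_Lucene.
// Tokens are plain arrays here instead of Zend_Search_Lucene_Analysis_Token objects.
class SketchStreamAnalyzer
{
    protected $_input = null;   // input string, as in the documented $_input
    protected $_encoding = '';  // input encoding, as in the documented $_encoding
    private $_position = 0;     // byte offset of the next unread character (sketch only)

    // Store the input and rewind the stream (assumed behaviour).
    public function setInput($data, $encoding = '')
    {
        $this->_input = $data;
        $this->_encoding = $encoding;
        $this->reset();
    }

    // Rewind to the start of the current input.
    public function reset()
    {
        $this->_position = 0;
    }

    // Return the next run of ASCII letters or digits, or null when the input is exhausted.
    public function nextToken()
    {
        if ($this->_input === null) {
            return null;
        }
        if (!preg_match('/[a-zA-Z0-9]+/', $this->_input, $m, PREG_OFFSET_CAPTURE, $this->_position)) {
            return null;
        }
        $text  = $m[0][0];
        $start = $m[0][1];
        $this->_position = $start + strlen($text);
        return array('text' => strtolower($text), 'start' => $start, 'end' => $this->_position);
    }
}

$analyzer = new SketchStreamAnalyzer();
$analyzer->setInput('Hello, Lucene world');

// Pull tokens one at a time until the stream signals its end with null.
$terms = array();
while (($token = $analyzer->nextToken()) !== null) {
    $terms[] = $token['text'];
}

// reset() lets the same input be read again from the beginning.
$analyzer->reset();
$first = $analyzer->nextToken();
```

The strict `!== null` comparison matters in the loop: a token whose text is `'0'` would be treated as false by a loose check and end the loop early.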
tokenize(string $data, $encoding = '') : array
Tokens are returned in UTF-8 (the internal Zend_Search_Lucene encoding).
Parameters: $data : string
Returns: array
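One plausible reading of tokenize() is that it drains the stream API into an array, with any declared input encoding converted to UTF-8 first. The sketch below illustrates that reading only. SketchAnalyzer and SketchLetterAnalyzer are hypothetical names, tokens are plain strings rather than token objects, and the use of iconv() in reset() is an assumption about where the conversion could happen.

```php
<?php
// Hypothetical base class that reuses the documented method names; not the library source.
abstract class SketchAnalyzer
{
    protected $_input = null;
    protected $_encoding = '';

    public function setInput($data, $encoding = '')
    {
        $this->_input = $data;
        $this->_encoding = $encoding;
        $this->reset();
    }

    abstract public function reset();
    abstract public function nextToken();

    // Assumed relationship: tokenize() collects every token from the stream API.
    public function tokenize($data, $encoding = '')
    {
        $this->setInput($data, $encoding);
        $tokens = array();
        while (($token = $this->nextToken()) !== null) {
            $tokens[] = $token;
        }
        return $tokens;
    }
}

// Hypothetical concrete analyzer: emits runs of Unicode letters as UTF-8 strings.
class SketchLetterAnalyzer extends SketchAnalyzer
{
    private $_utf8 = '';
    private $_position = 0;

    public function reset()
    {
        $this->_position = 0;
        $input = (string) $this->_input;
        // Convert to UTF-8 when a different source encoding is declared.
        if ($this->_encoding !== '' && strcasecmp($this->_encoding, 'UTF-8') !== 0) {
            $converted = iconv($this->_encoding, 'UTF-8', $input);
            $input = ($converted === false) ? '' : $converted;
        }
        $this->_utf8 = $input;
    }

    public function nextToken()
    {
        // The offset is in bytes and always sits on a character boundary here.
        if (!preg_match('/\p{L}+/u', $this->_utf8, $m, PREG_OFFSET_CAPTURE, $this->_position)) {
            return null;
        }
        $this->_position = $m[0][1] + strlen($m[0][0]);
        return $m[0][0];
    }
}

$analyzer = new SketchLetterAnalyzer();

// "Café au lait" encoded as ISO-8859-1: the é is the single byte 0xE9.
$tokens = $analyzer->tokenize("Caf\xE9 au lait", 'ISO-8859-1');

// With no encoding declared, the input is used as-is.
$plain = $analyzer->tokenize('plain text');
```

Returning an array is convenient for short strings, which matches the note that the analysis API works with data strings and arrays rather than streams.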
$_encoding : string
Default: ''
$_input : string
Default: null
$_defaultImpl : \Zend_Search_Lucene_Analysis_Analyzer