Standard indexer classes
Index classes provided by a standard install of the Cinnamon CMS
Indexer classes
The following are the default indexer classes:
- DefaultIndexer
- The DefaultIndexer takes the string value of an XML node and stores it after running it through Lucene's StandardAnalyzer. The string is treated as text and tokenized along word boundaries, which may lead to unexpected results if you index serial numbers or product names that contain '_', '.', or other problematic characters. Use this indexer for ordinary text that you later want to search for individual words.
- IntegerXPathIndexer
- This indexer class converts whole numbers into formatted strings. Formatted is the keyword here: if you wish to search for a range of values, for example all documents with a length between 1000 and 2000 bytes, this only works if the values are stored in a format that sorts correctly as text. The problem with lexical sorting is that "20" sorts between "2" and "4", so 1, 2, 20, 4 would be a valid result for the range 1..4. To solve this problem, all numbers are converted to a 20-character string which is padded with zeros from the left. (20 characters should be enough, unless you are documenting the national debt.)
- CompleteStringIndexer
- Sometimes it is important to search for an exact string which contains special non-word characters and numbers. For example, you have an account number of "A123 154 456 ZZ" and need to be able to search for all items beginning with "A123 15" and ending with "Z". An indexer which indexes A123 and ZZ separately would not help you here, and the cost of an "A123 near Z" search would be prohibitive. In moments like this, where lesser search engines would only ask "did you mean A320?", you can use the CompleteStringIndexer, which stores the string value of an XML node in its original form.
- DateTimeIndexer
- Indexes date/time strings in the string form "YYYY-MM-DDThh:mm:ss", for example 2000-02-23T19:55:22. Internally, it converts the value into milliseconds and stores them in the index.
- DateXPathIndexer
- Stores only the date part of a string which is formatted like "YYYY-MM-DDThh:mm:ss" in the index.
- TimeIndexer
- Stores only the time part of a string which is formatted like "YYYY-MM-DDThh:mm:ss" in the index.
- DecimalXPathIndexer
- Converts decimal numbers like 3.14 into 20 character strings and indexes them.
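The padding described for the IntegerXPathIndexer can be illustrated with a small sketch. This is not Cinnamon's implementation (which is built on Lucene); it is a Python stand-in, and the names `PAD_WIDTH`, `pad_integer` and `in_lexical_range` are made up for this example. It only assumes what the text states: whole numbers are left-padded with zeros to 20 characters so that comparing strings gives the same order as comparing numbers.

```python
PAD_WIDTH = 20  # width stated in the description above


def pad_integer(value: int) -> str:
    """Left-pad a non-negative whole number with zeros to PAD_WIDTH characters."""
    if value < 0:
        # Negative numbers are out of scope for this sketch.
        raise ValueError("this sketch only handles non-negative numbers")
    return str(value).zfill(PAD_WIDTH)


def in_lexical_range(term: str, lower: str, upper: str) -> bool:
    """Range check done purely by string comparison, as a text index would do."""
    return lower <= term <= upper


values = [1, 2, 20, 4]

# Without padding, "20" sorts between "2" and "4".
plain_sorted = sorted(str(v) for v in values)

# With padding, string order and numeric order agree.
padded_sorted = sorted(pad_integer(v) for v in values)

# Unpadded: 20 wrongly falls into the range 1..4.
plain_hit = in_lexical_range("20", "1", "4")

# Padded: 20 is correctly outside the range 1..4.
padded_hit = in_lexical_range(pad_integer(20), pad_integer(1), pad_integer(4))

print(plain_sorted)
print(padded_sorted)
print(plain_hit, padded_hit)
```

The same idea applies to any value that is searched by range in a text index: the stored form has to sort as text in the same order as the original values.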
Reverse indexing
Reverse indexing stores a string by reversing its characters and either running them through the Lucene Analyzer (for the ReverseStringIndexer) or storing them directly (for the CompleteStringIndexer variants). The client uses these indexes to implement fast "endsWith" searches, so if your client program offers wildcard searches to the user, this may be a useful performance optimization.
- ReverseStringIndexer
- ReverseCompleteStringIndexer
- DescendingReverseStringIndexer
- DescendingReverseCompleteStringIndexer
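Why reversing helps can be shown with a minimal sketch. It is written in Python and does not use Lucene or any Cinnamon API; the sorted list `reverse_keys` stands in for the index, and `reverse_key` and `ends_with` are hypothetical names. The point is that an "endsWith" search on the original string becomes a prefix search on the reversed string, and a prefix search can jump straight to the matching region of a sorted index.

```python
from bisect import bisect_left


def reverse_key(text: str) -> str:
    """Reverse the characters, as a reverse indexer would before storing."""
    return text[::-1]


documents = ["A123 154 456 ZZ", "B999 000 111 ZA", "A123 777 888 QZ"]

# Stand-in for the index: reversed strings kept in sorted order.
reverse_keys = sorted(reverse_key(d) for d in documents)


def ends_with(suffix: str) -> list[str]:
    """Find all originals ending with `suffix` via a prefix scan on reversed keys."""
    prefix = reverse_key(suffix)
    # Binary search to the first key that could start with the prefix.
    position = bisect_left(reverse_keys, prefix)
    matches = []
    # All keys sharing the prefix are adjacent in sorted order.
    while position < len(reverse_keys) and reverse_keys[position].startswith(prefix):
        matches.append(reverse_key(reverse_keys[position]))  # restore the original
        position += 1
    return matches


print(ends_with("Z"))
print(ends_with("ZZ"))
```

Without the reversed form, a search for "*Z" would have to inspect every term in the index, because terms that share an ending are scattered across the sort order.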
Descending indexers
When indexing XML and HTML nodes, often it is not enough to get the content of an element like <title>Engine part list</title>, because you also want to index the content of the child nodes (like in <p>Press the <em>red</em> button</p>) without the need to configure an index item for each element that may occur.
The descending indexers retrieve the string value of every child element of a given node and add them (separated by a space) to the indexed string. In the example above, the result would be "Press the red button" instead of just "Press the button", which is what you would get if you were to use the DefaultIndexer.
- DescendingStringIndexer
- DescendingCompleteStringIndexer
- DescendingReverseStringIndexer
- DescendingReverseCompleteStringIndexer
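The difference between the two results in the example can be reproduced with a short sketch. It uses Python's standard `xml.etree.ElementTree` rather than Cinnamon's own XML handling, and the helpers `own_text` and `descending_text` are hypothetical names that only mimic the behaviour described above: one collects the text belonging directly to a node, the other also descends into its child elements and joins the pieces with a space.

```python
import xml.etree.ElementTree as ET


def normalize(text: str) -> str:
    """Collapse runs of whitespace into single spaces."""
    return " ".join(text.split())


def own_text(element: ET.Element) -> str:
    """Text that belongs directly to the element, skipping child element content."""
    parts = [element.text or ""]
    # The text following each child (its tail) still belongs to the parent.
    parts.extend(child.tail or "" for child in element)
    return normalize(" ".join(parts))


def descending_text(element: ET.Element) -> str:
    """Text of the element and all of its descendants, separated by spaces."""
    return normalize(" ".join(element.itertext()))


paragraph = ET.fromstring("<p>Press the <em>red</em> button</p>")

print(own_text(paragraph))         # Press the button
print(descending_text(paragraph))  # Press the red button
```

Joining the pieces with a space means that inline markup in the middle of a word would split that word in the indexed string, so this approach suits markup that wraps whole words or phrases.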

