How to add a new IndexItem

This tutorial shows how to add a new IndexItem for indexing content with the Lucene search engine.

Note: currently, indexing and searching are centered around XML content. This tutorial assumes that you are going to index well-formed XML or XHTML files which can be parsed with an XML parser. If you need to index other content with Cinnamon (Word documents, PDF, FrameMaker, ...), please contact us.

Example document (Source: DMC-S1000DBIKE-AAA-DA5-00-00-00AA-041A-A_006-00_EN-US.xml from s1000d.org)

<dmodule>
...
<description>
<levelledPara>
<title>Gears</title>
<para>The gears include the mechanism, the hubs and the shifters.</para>
...
<para>The bicycles of these days can have 27 gears or more.
 The mountain bikes use a set that includes: <randomList listItemPrefix="pf02">
<listItem>
<para>Three socket sprockets of different dimension on the front</para>
</listItem>
<listItem>
<para>Nine socket sprockets of different dimensions at the rear</para>
</listItem>
</randomList>
</para>
...
</levelledPara>
</description>
...
</dmodule>

Create a new IndexItem in the AdminTool

[Image: Create index item]

Select a name

This is an internal name. It is useful for giving you or your successor an idea of what this IndexItem is about.

Select a field name

This is the label under which Lucene will store the individual strings which we index. You (or rather, your client) will need this for the search requests. You may define several field names to help create semantic search features: if all words from <title> elements are indexed under the field name "title" instead of being lumped together under "content", you can later restrict a search to words that occur in title elements.
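
To illustrate why separate field names matter, here is a minimal sketch in Python. It is a toy inverted index, not Lucene and not Cinnamon code; the function names `build_index` and `search` and the second sample document are assumptions made up for this illustration.

```python
from collections import defaultdict

def build_index(docs):
    """Map (field name, lower-cased word) to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, fields in docs.items():
        for field, text in fields.items():
            for word in text.lower().split():
                # Strip simple trailing/leading punctuation from each token.
                index[(field, word.strip(".,:;"))].add(doc_id)
    return index

def search(index, field, word):
    """Return the sorted ids of documents that contain `word` in `field`."""
    return sorted(index.get((field, word.lower()), set()))

# Hypothetical documents: id 1 follows the example above, id 2 is invented.
docs = {
    1: {"title": "Gears",
        "content": "The gears include the mechanism, the hubs and the shifters."},
    2: {"title": "Brakes",
        "content": "Check the gears before you adjust the brakes."},
}
index = build_index(docs)

title_hits = search(index, "title", "gears")      # only document 1
content_hits = search(index, "content", "gears")  # documents 1 and 2
```

Searching the "title" field finds only the document whose title contains the word, while the same word in the "content" field matches both documents.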

Determine if the IndexItem indexes content, custom metadata or system metadata

Most of the time, you will probably just set "For content" to active, indicating that this index item is only used for indexing content. Sometimes, however, it is appropriate to index the other fields as well. For example, a document's name is stored in the system metadata (in /sysMeta/object/name), but it may also appear in the custom metadata (for example, in a list of documents referenced there) and in the content as an element //name. In that case, you can tell Cinnamon to index all three kinds of data.

Choose an IndexGroup

This depends on your client - usually, the default group is ok.

Choose an indexer

To keep things simple, we choose the DefaultIndexer (as this example does not show any child elements of para which need indexing).

Do you want to index multiple elements?

If your XPath search_string (the next field) is going to return a list of XML nodes (for example //para) instead of just one element (like <title>), check this field.

Identify any conditions on the document that have to be met

We only want to index objects of type "dmodule", so the XPath for the search condition is:

 string(/sysMeta/object/objectType/sysName) = 'dmodule'
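
What this condition checks can be sketched with Python's standard library. This is only an illustration: the sysMeta sample below is a hypothetical, minimal document built from the two paths named in this tutorial (the real system metadata will contain more), and `matches_condition` is a made-up helper, not part of Cinnamon. ElementTree does not support the XPath string() function, so the sketch uses findtext() with an empty default to get the same effect.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal system metadata; real sysMeta documents contain more elements.
SYS_META = """<sysMeta>
  <object>
    <name>DMC-S1000DBIKE-AAA-DA5-00-00-00AA-041A-A</name>
    <objectType>
      <sysName>dmodule</sysName>
    </objectType>
  </object>
</sysMeta>"""

def matches_condition(sys_meta_xml, wanted="dmodule"):
    """Approximate string(/sysMeta/object/objectType/sysName) = wanted."""
    root = ET.fromstring(sys_meta_xml)  # root is the <sysMeta> element
    # findtext returns the text of the first match, or "" if nothing matches,
    # similar to string() yielding an empty string for an empty node set.
    return root.findtext("object/objectType/sysName", default="") == wanted

is_dmodule = matches_condition(SYS_META)
```

An object whose type has a different sysName would make the condition false, so the IndexItem would not be applied to it.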

Identify the element you want to index

In this case, we want to index the whole text of the description section, but not the title element. This can be achieved by indexing all <para> elements and their descendants. The XPath for the search string is: /dmodule/description/*/para
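
To check which nodes this XPath selects, you can try it outside of Cinnamon. The following sketch uses Python's xml.etree.ElementTree on a shortened, well-formed copy of the example document. It only shows which elements match and what text they contain, including descendants; it makes no claim about how the DefaultIndexer processes them internally.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example document (the "..." parts are left out).
SAMPLE = """<dmodule>
  <description>
    <levelledPara>
      <title>Gears</title>
      <para>The gears include the mechanism, the hubs and the shifters.</para>
      <para>The mountain bikes use a set that includes: <randomList listItemPrefix="pf02">
        <listItem><para>Three socket sprockets of different dimension on the front</para></listItem>
        <listItem><para>Nine socket sprockets of different dimensions at the rear</para></listItem>
      </randomList></para>
    </levelledPara>
  </description>
</dmodule>"""

root = ET.fromstring(SAMPLE)  # root is the <dmodule> element

# Relative form of /dmodule/description/*/para, evaluated from the root element.
paras = root.findall("./description/*/para")

# Collect the text of each matched <para> including all descendants,
# then collapse runs of whitespace into single spaces.
texts = [" ".join("".join(p.itertext()).split()) for p in paras]

for text in texts:
    print(text)
```

The XPath returns more than one node here, which is why the "multiple elements" option applies to this IndexItem. The <title> element is not matched, while the list items nested inside the second <para> are part of that element's text.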

As your new IndexItem is not systemic (that is, it is not an item provided by default by the Cinnamon server), leave the next field unchecked. Also, you should leave the Value Assistence Provider Params as they are, unless you know what you are doing.

Update your index

After a new IndexItem has been created, your whole index needs to be updated. Depending on how many documents you have, this may take some time. To start the indexing process, you need direct database access (at the moment) and should set the column objects.index_ok to NULL. Also, make sure that the parameter to start the index server process is set correctly in your cinnamon_config.xml:

<startIndexServer>true</startIndexServer>
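
The reset of objects.index_ok described above can be sketched as follows. This example runs against an in-memory SQLite table as a stand-in, so it is safe to try. Only the table name `objects` and the column `index_ok` come from this tutorial; the other columns, the sample rows and the column types are assumptions. On your real Cinnamon database you would run the equivalent UPDATE statement with your database's own client.

```python
import sqlite3

# In-memory stand-in for the Cinnamon database (schema is hypothetical
# apart from the table name "objects" and the column "index_ok").
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objects (id INTEGER PRIMARY KEY, name TEXT, index_ok INTEGER)"
)
conn.executemany(
    "INSERT INTO objects (name, index_ok) VALUES (?, ?)",
    [("doc-a", 1), ("doc-b", 1), ("doc-c", 0)],
)

# Mark every object for re-indexing by clearing its index status.
cursor = conn.execute("UPDATE objects SET index_ok = NULL")
conn.commit()
updated = cursor.rowcount

# Count the objects that are now waiting to be indexed again.
pending = conn.execute(
    "SELECT COUNT(*) FROM objects WHERE index_ok IS NULL"
).fetchone()[0]
print(f"{updated} rows updated, {pending} objects pending re-index")
```

Because the statement has no WHERE clause, it touches every row in the table, which matches the requirement that the whole index be rebuilt.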
