Lucene Index in AEM - Part 2

This post walks through a full text search scenario and the steps to create a custom Lucene Full Text Index.


At a high level, full text search requires indexing nodes and properties, the two main ways in which content is held in the repository.

For property:
  • To index a specific property for both the full text and the property constraint scenarios, add the properties below to its property definition under the respective indexRule:
  • Full text:
    • Name | Type | Value
      nodeScopeIndex | Boolean | true
  • For property constraint (name and isRegexp are interrelated, as explained in the previous post):
    • Name | Type | Value
      propertyIndex | Boolean | true
      name | String | property name or regex pattern
      isRegexp | Boolean | false or true
  • Example: the jcr:title property of a page might be used in queries with property constraints, in full text/contains queries, or in both.
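The relationship between name and isRegexp can be pictured with a small Python sketch. It is only an illustrative model and not Oak's implementation: the PropertyDefinition class, the matches function and the two sample definitions are hypothetical names introduced here.

```python
import re
from dataclasses import dataclass


@dataclass
class PropertyDefinition:
    """Hypothetical model of one property definition under an indexRule."""
    name: str                        # property name, or a regex when is_regexp is True
    is_regexp: bool = False          # mirrors isRegexp
    property_index: bool = False     # mirrors propertyIndex (property constraints)
    node_scope_index: bool = False   # mirrors nodeScopeIndex (full text)


def matches(definition: PropertyDefinition, property_name: str) -> bool:
    """Return True if the definition applies to the given property name."""
    if definition.is_regexp:
        # "name" is read as a pattern that must cover the whole property name.
        return re.fullmatch(definition.name, property_name) is not None
    # Otherwise "name" is compared literally.
    return definition.name == property_name


# jcr:title indexed for both property constraints and full text.
title = PropertyDefinition("jcr:title", property_index=True, node_scope_index=True)
# Assumed regex definition covering every property that sits directly on the node.
catch_all = PropertyDefinition("^[^/]*$", is_regexp=True, node_scope_index=True)

print(matches(title, "jcr:title"))            # True
print(matches(title, "jcr:description"))      # False
print(matches(catch_all, "jcr:description"))  # True
```

With isRegexp set to false the name must be the exact property name; with isRegexp set to true the same field is interpreted as a pattern, which is why the two are configured together.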
For node:
  • An indexRule for a specific nodeType, together with an aggregates definition covering its related child nodes, causes those nodes to be indexed as part of the Lucene Full Text Index.
  • Example : 
    • Suppose a full text feature must display all pages and assets of a specific project, say xyz. We can create a custom Lucene Full Text Index that targets the project-specific paths and the related node types, cq:Page and dam:Asset (in the indexRules and aggregates definitions), and hence their properties.
    • If we do not create a custom index and instead use the "type" predicate with "cq:Page" and "dam:Asset" in the fulltext query (a group query to bring in both types), the OOB cqPageLucene and damAssetLucene indexes will be considered, based on their definitions.
OOB Lucene Full Text Index (/oak:index/lucene):
  • nt:base is the supertype of all node types, such as nt:unstructured, cq:Page, dam:Asset and dam:AssetContent.
  • Given this, on a local instance without any custom index definition, a query with the "fulltext" predicate will use "/oak:index/lucene" (the OOB Lucene Full Text Index).
  • This index has indexRules configured for the nodeType nt:base, with a whole list of properties defined under it.
  • Except for the last three, each of these properties has a property named "index" set to false (see the screenshot below).

  • [Screenshot: lucene full text index]

  • The key point is the last property (highlighted in the screenshot below), which indexes all properties of nt:base nodes and therefore of all nodes, since nt:base is the supertype of every node type.
  • For this reason, a full text query without a "type" predicate uses this index (/oak:index/lucene).
  • The nt:base node also has a property named "includePropertyTypes" with the values String and Boolean, which indicates that properties of type String and Boolean can be indexed.

  • [Screenshot: Lucene Full Text Index]
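The effect of the nt:base rule can be approximated with a deliberately simplified Python model. It is an assumption-laden sketch, not how Oak evaluates index rules internally: the RULES list, the excluded property name and the is_indexed function are hypothetical, and only the String/Boolean type filter and the catch-all last entry come from the description above.

```python
import re

# Property types accepted by the rule, as described for includePropertyTypes.
INCLUDE_PROPERTY_TYPES = {"String", "Boolean"}

# Simplified stand-in for the properties under indexRules/nt:base.
# Each entry: (name or pattern, is_regexp, index). Entries are illustrative only.
RULES = [
    ("example:excludedProp", False, False),  # hypothetical property with index=false
    ("^[^/]*$", True, True),                 # assumed catch-all entry listed last
]


def is_indexed(prop_name: str, prop_type: str) -> bool:
    """Decide, in this simplified model, whether a property ends up in the index."""
    if prop_type not in INCLUDE_PROPERTY_TYPES:
        return False
    for name, is_regexp, index in RULES:
        if is_regexp:
            hit = re.fullmatch(name, prop_name) is not None
        else:
            hit = name == prop_name
        if hit:
            # The first matching entry decides.
            return index
    return False


print(is_indexed("jcr:title", "String"))             # True
print(is_indexed("example:excludedProp", "String"))  # False
print(is_indexed("jcr:lastModified", "Date"))        # False
```

Because the rule is declared on nt:base, the same decision applies to nodes of every type, which is what makes this index a candidate for any full text query that does not narrow the node type.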
With this understanding of how nodes and properties are indexed as part of a Lucene Full Text Index, we will create a custom Lucene Full Text Index for a sample use case.

Use case: 
Get all the assets related to "biking" in the we-retail assets.
We will divide this into two scenarios:
  • Full text without the "type" predicate (to showcase the explanation above about the OOB Lucene Full Text Index, /oak:index/lucene)
  • Full text with desired "type" predicate. 
Without "type" predicate:
  • path=/content/dam/we-retail
  • fulltext=biking
  • p.limit=-1
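These predicates can be tried against the QueryBuilder JSON servlet (/bin/querybuilder.json). The Python sketch below only builds the request URL; the host and port are assumptions for a local author instance, and querybuilder_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed local author instance; adjust host and port for your setup.
AEM_HOST = "http://localhost:4502"
QUERYBUILDER_JSON = "/bin/querybuilder.json"


def querybuilder_url(predicates: dict) -> str:
    """Build a QueryBuilder JSON servlet URL from a predicate map."""
    return f"{AEM_HOST}{QUERYBUILDER_JSON}?{urlencode(predicates)}"


predicates = {
    "path": "/content/dam/we-retail",
    "fulltext": "biking",
    "p.limit": "-1",
}

url = querybuilder_url(predicates)
print(url)
# http://localhost:4502/bin/querybuilder.json?path=%2Fcontent%2Fdam%2Fwe-retail&fulltext=biking&p.limit=-1
```

The URL can then be opened in a browser session that is logged in to the instance, or requested with any HTTP client using valid credentials.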
Video demo:

With "type" predicate:
  • path=/content/dam/we-retail
  • type=dam:Asset
  • fulltext=biking
  • p.limit=-1
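To see why the "type" predicate changes which index is chosen, it helps to look at roughly the XPath that such predicates correspond to. The Python sketch below is an approximation for illustration only: to_xpath is a hypothetical helper, and the statement QueryBuilder actually generates may differ in details such as extra parentheses.

```python
from typing import Optional


def to_xpath(path: str, fulltext: str, node_type: Optional[str] = None) -> str:
    """Approximate the XPath for a path + fulltext (+ optional type) query."""
    # Without a type the query targets any node, which is covered by the nt:base rule.
    element = f"element(*, {node_type})" if node_type else "*"
    # Single quotes inside an XPath string literal are escaped by doubling them.
    escaped = fulltext.replace("'", "''")
    return f"/jcr:root{path}//{element}[jcr:contains(., '{escaped}')]"


print(to_xpath("/content/dam/we-retail", "biking"))
# /jcr:root/content/dam/we-retail//*[jcr:contains(., 'biking')]

print(to_xpath("/content/dam/we-retail", "biking", "dam:Asset"))
# /jcr:root/content/dam/we-retail//element(*, dam:Asset)[jcr:contains(., 'biking')]
```

In the second form the node type is part of the statement, so an index whose indexRules target dam:Asset can be matched instead of the generic nt:base one.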
Video demo: 

To create a custom Lucene Full Text Index for a new node type, the query is amended as below with the type "dam:AssetContent". (In the previous post we created damAssetContentLucene to illustrate the Lucene Property Index; we will build on it for the fulltext query.)
  • path=/content/dam/we-retail
  • type=dam:AssetContent
  • fulltext=biking
  • p.limit=-1
Video demo:
