ElasticSearch in Pega: How and when to use it
As you may have heard, ElasticSearch is now available for use in Pega, in versions 7.1.7 and up. From my experience building ElasticSearch-based functionality for a client, I can tell you it works like a charm… once you get it to work. It makes for a powerful search mechanism and offers more freedom than the traditional “Obj”-method based Pega search. Moreover, the search is lightweight compared to a search using Obj-Open, thanks to the more extensive indexing. Nonetheless, as with all things programming, there are trade-offs. In this article I will first discuss which usages ElasticSearch lends itself to, then address its drawbacks, and finally offer a 7-step plan for implementing it. Please note that this article is based on experience implementing ElasticSearch in Pega 7.2.0 and 7.2.1.
Use cases
To give you some concrete examples: say you’re working on an application that deals with the customer data of an insurance company. If your use case is to retrieve all active healthcare insurance policies distributed in a certain geographical region, you will be better served by an Obj-Browse/Obj-Open, a Report Definition or any of the other more traditional Pega features. If, on the other hand, your use case is to retrieve a certain insurance policy based on customer name, policy number, a combination of a date range and a zip code, or any other search term that is likely to yield a modest number of search results, ElasticSearch could significantly help you out.
Drawbacks
Before you start developing your search functionality, awareness of ElasticSearch’s limitations will help you decide whether yours is an appropriate use case for ElasticSearch or whether you might be better off choosing another solution. First of all, ElasticSearch returns at most 1000 results. This is confirmed by a Pega Engineer answering a question on PDN, and I have also encountered it in practice. It is out of your hands. If your use case is to offer a search functionality that returns results based on a specific and narrowly defined search, this limit should not affect you. If your use case could translate to a more general database query that would regularly exceed 1000 results, ElasticSearch might not do.
Moreover, you have no way to influence the sorting of the results. The Obj-methods in Pega let you specify a sort order; ElasticSearch does not. The OOTB activity “pxRetrieveSearchData”, which contains the ElasticSearch mechanism, is a black box: you cannot make it return the results sorted in a specific way. Of course you could engineer something to sort them afterward, but that might prove complex. Additionally, if your search query matches more than 1000 results, the lack of control over sorting means the results you get back are not necessarily the most relevant ones.
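To make "sorting afterward" a bit more tangible, here is a minimal sketch in Python rather than Pega. It is an illustration under assumptions, not Pega code: the hits are assumed to be plain dictionaries, `pxCreateDateTime` is assumed to be one of the properties you made returnable, and the timestamp layout is an assumed sample format. Note that it can only order the hits that were actually returned; it cannot recover hits cut off by the 1000-result limit.

```python
from datetime import datetime

# Assumed timestamp layout, e.g. "20161207T101500.000 GMT" (hypothetical sample data).
TS_FORMAT = "%Y%m%dT%H%M%S.%f GMT"


def sort_hits_by_date(hits, field="pxCreateDateTime", newest_first=True):
    """Return a new list of hits ordered by the timestamp stored in `field`.

    Hits that lack the field are placed at the end, in their original order.
    """
    dated = [h for h in hits if h.get(field)]
    undated = [h for h in hits if not h.get(field)]
    dated.sort(
        key=lambda h: datetime.strptime(h[field], TS_FORMAT),
        reverse=newest_first,
    )
    return dated + undated


# Hypothetical search hits, in the arbitrary order the search returned them.
hits = [
    {"pzInsKey": "WORK W-2", "pxCreateDateTime": "20161205T090000.000 GMT"},
    {"pzInsKey": "WORK W-3", "pxCreateDateTime": "20161207T101500.000 GMT"},
    {"pzInsKey": "WORK W-1"},
    {"pzInsKey": "WORK W-4", "pxCreateDateTime": "20161206T120000.000 GMT"},
]

sorted_keys = [h["pzInsKey"] for h in sort_hits_by_date(hits)]
print(sorted_keys)
```

In a real application the same idea would live in your wrapper logic, working on the result page instead of a Python list.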
To give you an example: say one of the key-value pairs you have specified as a possible search criterion is a date type. A user searches for Work items on this criterion with a date range of 3 days (ElasticSearch allows you to search on date ranges as well). Suppose the selection contains 5000 results. ElasticSearch will return roughly 1000 of them. These 1000 results will not necessarily be the most recent or the least recent, but rather a random-seeming cross section of the 5000 candidates. Of course, this specific example is not a good ElasticSearch use case, as it is too broad by definition. If the user were to combine the date range with another search criterion, or if it were plausible that this date range would only yield a handful of results, it would be a good ElasticSearch use case. Depending on the range of search options your users require, you might want to use a different solution, or you might want to educate your users on what type of searches they should use it for and what they can expect in terms of results.
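One way to set expectations is to tell users when their search was probably too broad. The sketch below, in Python for illustration only, flags a result set that has hit the limit. The cap value of 1000 is taken from the discussion above; the function name, the returned structure and the exact threshold are assumptions you would adapt to your own application and Pega version.

```python
RESULT_CAP = 1000  # limit described above; treat the exact value as an assumption


def assess_result_set(hits, cap=RESULT_CAP):
    """Report how many hits came back and whether the cap was probably reached."""
    count = len(hits)
    possibly_truncated = count >= cap
    if possibly_truncated:
        message = (
            "Showing {} results; more may exist. "
            "Add another search criterion to narrow the search.".format(count)
        )
    else:
        message = "Showing all {} results.".format(count)
    return {
        "count": count,
        "possibly_truncated": possibly_truncated,
        "message": message,
    }


# A narrow search (3 hits) versus a broad one that runs into the cap.
narrow = assess_result_set(["W-1", "W-2", "W-3"])
broad = assess_result_set(["W-{}".format(i) for i in range(1000)])
print(narrow["message"])
print(broad["message"])
```

Because the returned set is not sorted, a truncated result is best treated as a hint to refine the search rather than as a partial answer.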
Approach
Unfortunately, the information available on the subject is a bit fragmented and can be hard to find. The following plan of action should prove helpful:
- Read the following to help you understand the landscape: https://pdn.pega.com/sites/pdn.pega.com/files/help_v72/procomhelpmain.htm#concepts/concepts2/understandingsearch.htm
and the following to help you compile the search string relevant to your use case: https://lucene.apache.org/core/3_5_0/queryparsersyntax.html
- Make sure the Dynamic System Setting SearchSoapURI is configured and that the Data Admin System setting indexing/useDataInstances is set to true.
- Currently (Dec 2016) there is a bug that hinders the indexing of Data instances for full-text search. As a workaround, associate a Harness rule (a dummy will do) on the “Advanced” tab of the concrete Data class you want to have indexed.
PLEASE NOTE: this step is a workaround and will no longer be relevant once the underlying issue has been fixed.
- Decide if you need any properties to be returnable and/or filterable.
– If you want to perform searches based on key-value pairs you need to make those properties filterable. If you want these specific properties to be returned as search results as well, you need to make them returnable. To make properties filterable and/or returnable, create a Custom Search Property (SysAdmin category).
– If you don’t specify any Custom Search Properties, you will get full-text search hits on values in any property, and your search result will be the pzInsKeys of the instances containing the search term.
PLEASE NOTE: in Pega version 6 and older this works differently. If you’re using one of those versions you will need to check the documentation. ElasticSearch was officially introduced in 7.1.7 (see the release note “Improved full text search” in the references); its predecessor was Lucene Search in combination with a Report Definition.
- Make sure Indexing is enabled on your Work- and/or Data- instances. In Pega 7.2, this is configured on the “Search” tab of your System:Settings.
- If you’ve made any of the abovementioned changes to your application to facilitate ElasticSearch, you’ll need to re-index the instances you want to be searchable. In Pega 7.2, you can do this from the “Search” tab of your System:Settings.
- Write a wrapper activity that calls the activity “pxRetrieveSearchData” (Class: Rule-Obj-Report-Definition). Before calling it you’ll want to set the necessary parameters. After calling it you can manipulate the search result data to suit your needs.
That’s it! You should now have a functional search mechanism.
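To illustrate the search string mentioned in the first step, here is a minimal sketch of how such a string could be assembled following the Lucene query parser syntax linked there. It is written in Python purely as an illustration. The property names `CustomerName` and `StartDate` are hypothetical, the `yyyyMMdd` date layout is borrowed from the Lucene syntax page and may not match what your Pega index expects, and how you pass the string to “pxRetrieveSearchData” is not shown.

```python
import re
from datetime import date

# Characters the Lucene query parser treats as special; each one is
# escaped with a preceding backslash.
_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\])')


def escape_term(text):
    """Backslash-escape Lucene special characters in user-supplied text."""
    return _SPECIAL.sub(r"\\\1", text)


def field_clause(field, value):
    """Build a phrase match on one filterable property: field:"value"."""
    return '{}:"{}"'.format(field, escape_term(value))


def date_range_clause(field, start, end):
    """Build an inclusive range clause: field:[start TO end]."""
    return "{}:[{} TO {}]".format(
        field, start.strftime("%Y%m%d"), end.strftime("%Y%m%d")
    )


def build_query(*clauses):
    """Require every clause to match by joining them with AND."""
    return " AND ".join(clauses)


# Hypothetical example: a customer name combined with a 3-day date range.
query = build_query(
    field_clause("CustomerName", "O'Brien (Jr)"),
    date_range_clause("StartDate", date(2016, 12, 1), date(2016, 12, 3)),
)
print(query)
```

Combining a key-value clause with the date range, as in this example, is the kind of narrowing that keeps a search within the limits discussed under “Drawbacks”.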
ElasticSearch is a powerful search mechanism that you can leverage into a quick, complete and flexible search solution for your client. It is not a catch-all solution, but the information in the sections “Drawbacks” and “Use cases” of this article should help you determine whether or not you have a fitting ElasticSearch use case. If you have a suitable use case and are clear on the expectations of business users, following the 7-step plan in the section “Approach” should help you quickly develop your ElasticSearch feature and impress your client.
– Kimberley Huizing
“Improved full text search”, Pegasystems Inc. (Feb 2, 2016). As retrieved on December 7, 2016 from the Pega Developer Network website: https://pdn.pega.com/release-note/improved-full-text-search
“How to change the number of records returned from a work object search/how to configure ElastiSearch?”, Q.: ElliottDavisAustin, A.: nistr. As retrieved on December 7, 2016 from the Pega Developer Network website: https://pdn.pega.com/community/pega-product-support/question/how-change-number-records-returned-work-object-searchhow
“Elasticsearch fails to find Data instances”, Pegasystems Inc. (August 25, 2016). As retrieved on December 7, 2016 from the Pega Developer Network website: https://pdn.pega.com/support-articles/elasticsearch-fails-find-data-instances