Poor search quality results in a frustrating user experience. If you cannot show your customers targeted, personalized results that solve their problem, you will quickly lose them to the plethora of other options at their fingertips. In today’s connected world, users have more choices than ever, and if you cannot quickly and accurately give them what they need, your competitors will. These disappointed users will probably never return to your service. Bad search relevance essentially translates to bad service.
Good search results, on the other hand, keep users on the product and get them hooked. They fall in love with what they see and discover. This results in far superior user experience, engagement, and loyalty. Improving relevance is a long-term investment. You will see an uptick in all key business metrics and better conversions once you are able to serve better listings and recommendations to your customers. If you surface the most appropriate, valuable and high-quality content at the right time, it will drive results.
Search relevance is the degree to which your application accurately and intuitively answers the questions your users ask. A relevant search system learns over time and personalizes results based on deep understanding, domain knowledge, and each user’s likes and preferences. Your search algorithms need to sort document results so that the content most relevant to a query is shown first. The search results need to answer users’ questions and solve their problems.
Your goal should be to provide your users with the most comprehensive and relevant search results possible. If relevant data is available, you want to ensure it can be found quickly and easily. Essentially, you have to be very good at telling the difference between good and bad matches. You also want to quickly and accurately identify and eliminate incorrect data, spam, predatory offers, duplicate listings and misleading information from your search results. With better data shown to users, you will see an increase in time spent on your product, repeat visits and the number of searches performed.
You need to evaluate search against well-defined metrics and outcomes to scientifically and statistically measure the performance of your search optimization efforts. Every experiment that your product, engineering, data science and machine learning teams undertake with changes in Elasticsearch, Solr or Lucene should be tracked, monitored, measured and analyzed closely against well-defined key performance indicators (KPIs). The search quality team eventually needs to measure which changes translated into happier and more profitable customers, and then invest more in those initiatives. You need to improve your machine learning algorithms based on these measurements of search results. If you are not measuring results and analyzing them, you are driving in the dark on gut feeling, which in most cases turns out to be wrong.
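As a concrete starting point, standard offline relevance metrics can be computed directly from query logs. The Python sketch below assumes a hypothetical log where each query records the ranked document ids that were served and the set of ids later judged relevant; the function and variable names are illustrative, not from any particular library.

```python
# A minimal sketch of offline search KPIs, assuming you log, per query,
# the ranked document ids served and the set of ids judged relevant.

def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k results that are relevant."""
    top_k = ranked_ids[:k]
    return sum(1 for doc_id in top_k if doc_id in relevant_ids) / k

def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant result, 0 if none is found."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

# Hypothetical evaluation log: one entry per query.
eval_log = [
    {"ranked": ["d3", "d1", "d7"], "relevant": {"d1", "d9"}},
    {"ranked": ["d2", "d5", "d4"], "relevant": {"d2"}},
]

mrr = sum(reciprocal_rank(e["ranked"], e["relevant"]) for e in eval_log) / len(eval_log)
p_at_3 = sum(precision_at_k(e["ranked"], e["relevant"], 3) for e in eval_log) / len(eval_log)
print(f"MRR: {mrr:.3f}, P@3: {p_at_3:.3f}")
```

Tracking a handful of metrics like these before and after every ranking change gives your teams a shared, objective baseline instead of gut feeling.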
Improving search involves continuously iterating through a wide range of techniques and running A/B tests.
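To decide whether an A/B test actually moved a metric such as click-through rate, you need a significance check rather than eyeballing the numbers. Below is a minimal sketch using a two-proportion z-test; the traffic and click counts are made up for illustration.

```python
# A sketch of evaluating an A/B test on search click-through rate (CTR).
from math import sqrt, erf

def two_proportion_z_test(clicks_a, n_a, clicks_b, n_b):
    """Two-sided z-test for the difference between two CTRs."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    p_pool = (clicks_a + clicks_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided
    return z, p_value

# Control ranking vs. variant ranking (hypothetical counts).
z, p = two_proportion_z_test(clicks_a=1_250, n_a=10_000, clicks_b=1_360, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")  # ship the variant only if p is small enough
```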
The way users type a search query is often fuzzy and not clearly aligned with their intent. This is why you need humans in the loop alongside machines: ask people to judge search results and use those judgments to create high-quality training data sets. This will help improve the precision of your models over time, and relevance will automatically adapt to changes in user behavior.
You should build a process to ask human judges to score the quality of search results against a query. The resulting metrics will help you compare the performance of different search algorithms, understand how humans think about searching on your product, and identify specific areas for improvement.
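Graded judgments are typically rolled up into a ranking metric such as normalized discounted cumulative gain (NDCG), which rewards putting highly rated documents near the top. The sketch below scores one ranked list against averaged judge labels; the document ids and grades are invented for illustration.

```python
# A sketch of NDCG@k from graded human judgments (0 = bad ... 3 = perfect).
from math import log2

def dcg(gains):
    """Discounted cumulative gain: later positions contribute less."""
    return sum(g / log2(i + 2) for i, g in enumerate(gains))

def ndcg(ranked_ids, judgments, k=10):
    """DCG of the served ranking divided by the best achievable DCG."""
    gains = [judgments.get(doc_id, 0) for doc_id in ranked_ids[:k]]
    ideal = sorted(judgments.values(), reverse=True)[:k]
    return dcg(gains) / dcg(ideal) if dcg(ideal) > 0 else 0.0

# Averaged judge scores for one query (hypothetical).
judgments = {"d1": 3, "d2": 2, "d3": 0, "d4": 1}
print(f"NDCG@3: {ndcg(['d2', 'd1', 'd3'], judgments, k=3):.3f}")
```

Averaging NDCG across a fixed set of benchmark queries lets you compare two ranking algorithms side by side on the same judgments.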
The judgment process is subjective, and different people will make different choices. To solve this problem, you need to ask multiple people to assess the same results to get an average score. It is critical that you partner with a vendor whose human judges do not game the system by answering quickly and carelessly. There are several techniques to identify bad human judges. One common technique is to randomly inject questions where you know what the relevance score should be, and then check that the judges answer most of those correctly. Another technique is to review the input from judges who consistently answer differently from other judges for the same tasks. Having this process in place will help you avoid poor-quality and misleading training data.
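The first of those techniques, often called gold or trap questions, is straightforward to sketch. The code below flags judges whose accuracy on known-answer items falls below a threshold; all names, labels, and the tolerance parameter are illustrative assumptions, not a prescribed setup.

```python
# A sketch of the gold/trap-question technique for spotting unreliable judges.

GOLD_LABELS = {("q1", "d1"): 3, ("q2", "d5"): 0}  # (query, doc) pairs with known scores
ACCURACY_THRESHOLD = 0.8

def flag_unreliable_judges(judgments, tolerance=1):
    """judgments: {judge_id: {(query, doc): label}}. A gold answer counts as
    correct if it is within `tolerance` of the known label."""
    flagged = []
    for judge, answers in judgments.items():
        gold_answers = [(pair, label) for pair, label in answers.items()
                        if pair in GOLD_LABELS]
        if not gold_answers:
            continue  # this judge saw no gold items yet
        correct = sum(1 for pair, label in gold_answers
                      if abs(label - GOLD_LABELS[pair]) <= tolerance)
        if correct / len(gold_answers) < ACCURACY_THRESHOLD:
            flagged.append(judge)
    return flagged

judgments = {
    "judge_a": {("q1", "d1"): 3, ("q2", "d5"): 1, ("q3", "d2"): 2},
    "judge_b": {("q1", "d1"): 0, ("q2", "d5"): 3},  # misses both gold items
}
print(flag_unreliable_judges(judgments))  # ['judge_b']
```

Judges flagged this way can be retrained or excluded, and their past labels down-weighted before the scores are averaged into training data.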
Today, data is growing at an unprecedented speed. With advances in artificial intelligence (AI), machine learning, natural language processing and computing power, you have the tools and infrastructure to dramatically improve the quality of your search. Organizations need to create a centralized Search Quality and Relevance team that is tightly integrated with product teams. It’s time for search to take center stage to differentiate your product and build customer loyalty.