Imagine the scene: you’ve walked into a store intending to buy a coffee maker. It’s the end of the week, you’re tired, you want to finish your shopping and get home. After a few minutes wandering the aisles you approach a sales assistant and ask them for help. Here’s how the conversation goes:

Customer: “I need a coffee maker”
Assistant: “Sure, here’s 305 things you could buy all related to coffee.”

Customer: “No, I said coffee maker, not ground coffee, coffee beans or coffee-flavoured chocolate. Try again, I want a coffee maker from the kitchen accessories department”
Assistant: “Sure, here’s 55 things from that department related to coffee makers.”

Customer: “You’re misunderstanding me! I want one of those glass things with the bit you press down – cafe tear I think they’re called. I can’t see any of those in your list”
Assistant: “I’m sorry we don’t have any cafe tears.”
Customer: “Sorry, perhaps it’s my accent, do you have anything that sounds like ‘cafe tear’?”
Assistant: “We have one thing, it’s called a ‘cafetiere’, here it is.”
Customer: “That’s right! But the one you’re showing me is for 12 cups, I want a smaller one with a metal handle. You’re a big store, surely you sell more than one type? Look again for cafetieres?”
Assistant: “We actually have 12 different cafetieres – here they all are.”
Customer: “Finally! Why on earth didn’t you show me these just now, or when I first asked for a coffee maker? They’re only used for making coffee! I nearly walked out of this store, you’re lucky I’m still here…”


This conversation is based on some actual interactions with a real ecommerce site search engine on a major UK supermarket website. It illustrates some common problems: the customer using different terms to the merchant, over-long or over-short result lists, no automatic phrase boosting or spelling suggestions and eventually a frustrated customer who very nearly goes to a competitor. If a real-world sales assistant behaved like this they probably wouldn’t last long in the job!

Ecommerce is now a vital lifeline for many people and a major business driver – a 2020 study from Emarketer forecast that “UK retail ecommerce sales will account for 27.5% of total retail sales this year, and that proportion will approach one-third by 2024”. The COVID-19 pandemic has hugely accelerated a shift that was already underway, and businesses that don’t provide effective tools for ecommerce – including site search, the equivalent of a sales assistant – are at huge risk of losing sales and brand reputation. Online consumers are fickle and it’s far easier to switch to a different website than it is to walk across town to a different store – and often that website is Amazon or another giant competitor.


In short, ecommerce site search is broken. The reasons for this are manifold: the search engines provided by commercial ecommerce software are often badly integrated, out of date, hard to control and provide no way to measure search quality. Marketers, who best understand how to match customer needs to inventory – they know a cafetiere is a coffee maker and how many cafetiere types you sell – are seldom provided with the tools to influence or tune search results, or even told much about how the search engine works. IT, tasked with keeping the lights on, may not be aware of business objectives or targets and thus find it hard to prioritise search-related issues. Lastly, divining the actual intent of a customer from a two-word phrase is difficult if not impossible.

There are obvious benefits of improving site search – more successful searches lead directly to more conversions and thus revenue – but there are other benefits. An ability to examine site search logs and other pointers to user behaviour may reveal those items customers are searching for that a merchant doesn’t provide, a pointer to expanding inventory or to new trends and needs. Who would have predicted in 2019 that facemasks and hand sanitiser would need to be so widely available in 2020?
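As a rough illustration of this kind of log analysis, the Python sketch below counts the queries that returned nothing at all. The log format (pairs of query text and hit count) and the sample entries are assumptions invented for this example; they are not data from a real site, nor the format of any Chorus component.

```python
from collections import Counter

# Hypothetical search log: (query text, number of results returned).
SAMPLE_LOG = [
    ("coffee maker", 305),
    ("face mask", 0),
    ("cafe tear", 0),
    ("Face Mask", 0),
    ("hand sanitiser", 0),
    ("cafetiere", 12),
    ("face mask ", 0),
]


def zero_result_queries(log):
    """Count how often each normalised query returned no results."""
    counts = Counter()
    for query, hits in log:
        if hits == 0:
            # Lowercase and collapse whitespace so near-identical queries merge.
            counts[" ".join(query.lower().split())] += 1
    # Most frequent first: the strongest hints of unmet demand or vocabulary gaps.
    return counts.most_common()


if __name__ == "__main__":
    for query, count in zero_result_queries(SAMPLE_LOG):
        print(f"{count:3d}  {query}")
```

A list like this does not say why a query failed: ‘face mask’ might be missing inventory, while ‘cafe tear’ is a vocabulary or spelling problem, so each entry still needs a human look.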

Our approach at OpenSource Connections (OSC) to improving search can be summed up as ‘measure, experiment, repeat’. The first step is developing effective measurements of search quality – you need to know how bad (or good) search results are, and you must be able to measure this on a repeatable and frequent basis. The second step is to be able to easily make changes to search engine configuration and to assess the impact of these changes – the ability to experiment, rapidly and safely, offline. Once an offline experiment shows measurable improvements it can be promoted to online where A/B testing and click logs can be used to further measure impact.

This culture of rapid experimentation must be developed across the whole search team – not just within IT. We need to provide tools that marketers can use to react to rapidly changing situations, but we also need to base our testing on solid data. Our tools also need to be widely available, not tied to a particular platform or technology stack, well documented and battle tested. We need to give full control of search back to the people who need it.

OSC has been working with a number of others across the industry to bring together a suite of freely available, open source tools that can be used to build measurable and tunable ecommerce site search. The group has christened this initiative Chorus and based the development on one of the two leading open source search engines, Apache Solr, which is widely used in ecommerce, sometimes as part of commercial packages. A variant for Elasticsearch, the other popular engine, is in active development.

OSC’s Quepid tool is one part of the ensemble: it lets a team create test cases, add queries to those cases and collaborate with subject matter experts to judge search quality. Quepid lets users (who need no deep search expertise) ‘rate’ search results on a scale using a simple web interface and gives an overall quality score. Importantly, once a change has been made to the search engine configuration, Quepid can easily re-run the test and show the change in overall quality, allowing promising experiments to be identified.
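To make the idea of an overall quality score concrete, here is a minimal Python sketch of one widely used metric, nDCG (normalised discounted cumulative gain), averaged over the queries in a test case. It is an illustration only: the function names, the 0 to 3 rating scale and the sample ratings are assumptions, and this is not Quepid's code or necessarily the scorer a given Quepid case uses.

```python
import math


def dcg(ratings, k=10):
    """Discounted cumulative gain: ratings near the top of the list count for more."""
    return sum(r / math.log2(pos + 2) for pos, r in enumerate(ratings[:k]))


def ndcg(ratings, k=10):
    """DCG divided by the DCG of the same ratings sorted into the ideal order."""
    ideal = dcg(sorted(ratings, reverse=True), k)
    return dcg(ratings, k) / ideal if ideal > 0 else 0.0


def case_score(rated_queries, k=10):
    """Average nDCG over every query in a test case (0.0 for an empty case)."""
    if not rated_queries:
        return 0.0
    return sum(ndcg(r, k) for r in rated_queries.values()) / len(rated_queries)


if __name__ == "__main__":
    # Hypothetical ratings (0 = irrelevant, 3 = perfect) for the top five results,
    # before and after a configuration change.
    before = {"coffee maker": [0, 0, 1, 0, 3]}
    after = {"coffee maker": [3, 1, 0, 0, 0]}
    print(f"before: {case_score(before):.2f}  after: {case_score(after):.2f}")
```

Note that this sketch computes the ideal ordering from the rated results alone, so it rewards putting the best of what was returned at the top; it cannot tell that a good product was never returned in the first place.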

Another part of the suite allows business rules to be added directly to the search engine, for example synonyms, to help with the problem that your customers may not use the same language as you do when describing products. Boosting is another technique available to move certain results higher up the list. It is also possible to turn dimensions into ranges – for example, to match up a customer looking for a 33-inch TV screen and a merchant who sells 32-inch and 36-inch screens, both of which may be acceptable results as they are close in size. Querqy is a query preprocessor that helps turn this customer language into an effective search query, and SMUI is a web interface that helps manage these business rules. These two tools give search teams improved capabilities in active search management, also known as ‘searchandising’.
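The three kinds of rule described here (synonyms, boosts and numeric ranges) can be sketched in a few lines of Python. This is a toy model of the idea, not Querqy's rule syntax or API: the RULES dictionary, the field names category and size_inches, the boost weight and the 10% tolerance are all assumptions chosen for illustration.

```python
import re

# Hypothetical rule set. In Chorus, rules like these would be managed in SMUI and
# applied by Querqy, whose real rule format differs from this dictionary.
RULES = {
    "coffee maker": {
        "synonyms": ["cafetiere", "french press"],
        "boost": {"field": "category", "value": "kitchen accessories", "weight": 2.0},
    },
}

# Matches sizes such as "33-inch", "33 inch" or "33inches".
SIZE_PATTERN = re.compile(r"(\d+)[- ]?inch(?:es)?\b")


def rewrite(query, tolerance=0.1):
    """Turn customer language into a structured description of what to search for."""
    text = " ".join(query.lower().split())
    rewritten = {"terms": [], "boosts": [], "ranges": []}

    # A stated size becomes a range, so nearby sizes still match.
    match = SIZE_PATTERN.search(text)
    if match:
        size = int(match.group(1))
        rewritten["ranges"].append({
            "field": "size_inches",  # assumed field name
            "min": size * (1 - tolerance),
            "max": size * (1 + tolerance),
        })
        text = " ".join(SIZE_PATTERN.sub(" ", text).split())

    rewritten["terms"].append(text)

    # Apply synonym and boost rules when the remaining text matches a rule exactly.
    rule = RULES.get(text)
    if rule:
        rewritten["terms"].extend(rule["synonyms"])
        rewritten["boosts"].append(rule["boost"])
    return rewritten


if __name__ == "__main__":
    print(rewrite("coffee maker"))
    print(rewrite("33-inch TV screen"))
```

With a 10% tolerance, a query for a 33-inch screen becomes a range of roughly 29.7 to 36.3 inches, which takes in both the 32-inch and 36-inch models from the example above.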

Let’s return to our example above: how might you fix it with Chorus? First, we would use our search logs to make sure we were testing common search queries: if ‘coffee maker’ was a common query then a test should be run for it (if not, perhaps there are more important things for our team to consider given limited resources and time). We would then use Quepid to create a test case including the query ‘coffee maker’ and ask our subject matter experts – our marketers – to rate the results. Using this ground truth data we would try some experiments to see if we could improve things: perhaps ‘cafetiere’ (or ‘french press’) should be created as a synonym for ‘coffee maker’, or a boost applied if the result was in the ‘kitchen accessories’ category. We can try both these techniques using SMUI and Querqy and rapidly see how results are affected. With Solr or Elasticsearch we can also try different spelling suggester configurations, which might help with ‘cafe tear’. Eventually, once our offline testing had identified some candidate improvements, we would consider them for online testing.
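To show why a spelling suggester can rescue ‘cafe tear’, the Python sketch below uses plain edit distance against a small product vocabulary. It illustrates the general principle only: the vocabulary, the function names and the threshold of three edits are assumptions, and the suggesters built into Solr and Elasticsearch are configured and implemented differently.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest single-character inserts, deletes or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep a matching character
            ))
        previous = current
    return previous[-1]


def squash(text):
    """Lowercase and remove whitespace so 'cafe tear' compares as 'cafetear'."""
    return "".join(text.lower().split())


def suggest(query, vocabulary, max_distance=3):
    """Return the closest vocabulary term within max_distance edits, else None."""
    target = squash(query)
    best = min(vocabulary, key=lambda term: edit_distance(target, squash(term)), default=None)
    if best is not None and edit_distance(target, squash(best)) <= max_distance:
        return best
    return None


if __name__ == "__main__":
    # Hypothetical vocabulary, e.g. terms taken from product titles.
    vocabulary = ["kettle", "toaster", "cafetiere", "espresso machine"]
    print(suggest("cafe tear", vocabulary))
```

The threshold is itself a tuning parameter: too low and ‘cafe tear’ finds nothing, too high and unrelated products start to be suggested, which is exactly the kind of trade-off that offline testing against rated queries helps to settle.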

There are several other components which can assist with large scale batch testing, finding optimum configuration parameters and automated deployment of the platform. Our group is already working with a number of leading ecommerce websites to deploy Chorus and give control of site search back to search teams. We also welcome any contributions to the project.

Come and join the Chorus!


  1. UK Ecommerce 2020 – Digital Buying Takes Hold as Pandemic Decimates the High Street https://www.emarketer.com/content/uk-ecommerce-2020
  2. Test your site search with a free downloadable assessment guide www.opensourceconnections.com/guide/ecommerce
  3. Meet Pete the Product Owner – a series of blogs and videos demonstrating Chorus https://opensourceconnections.com/blog/2020/07/07/meet-pete-the-e-commerce-search-product-manager/
  4. https://github.com/querqy/chorus to download Chorus
  5. www.querqy.org for Chorus documentation
  6. Join the free search community Relevance Slack at www.opensourceconnections.com/slack
