I'm trying to build a semantic search engine for my production application. I first tried building one from scratch and did the tokenization, vectorization, and the other steps myself, but the whole process was taking too much time.

So I decided to use an existing tool instead, such as Elasticsearch, Amazon Kendra, or Qdrant. After some research I settled on Amazon Kendra, because it is marketed as a specialized enterprise search engine.

When I tested the quality of the search results, I found that Kendra isn't very powerful. It seems more suitable for keyword search than semantic search. For example, I took an e-commerce dataset related to clothing. The word 'Shirts' appears many times in the dataset, but when my query contains a misspelled word, as in "Show me records related to shurts", Kendra doesn't recognize the user intent and returns the message "No results found". Kendra doesn't even seem to implement something as simple as fuzzy matching.

What I have tried:

Did I make the wrong choice by choosing Kendra? Are there better alternatives? Or am I supposed to implement the fuzzy matching myself in Python code when working with Kendra?
Updated 19-Sep-23 10:35am

The documentation at "Amazon Kendra Features - Amazon Web Services" suggests that it offers intelligent search. You need to study it in detail.
 
 
The problem isn't with Kendra. It's with your understanding of how it works and your expectations. There are plenty of areas that can cause your searches to fail to return the expected results: how you're indexing your documents, which filters, tokenizers, and analyzers you're using at index time, which of those you're using to parse queries, whether you're using the spell checker for queries, and more, in any combination.
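As a rough illustration of the spell checker point, here is a minimal sketch that asks Kendra for spelling suggestions and retries once if the first query returns nothing. It assumes the boto3 Kendra `query` call accepts `SpellCorrectionConfiguration` and returns `SpellCorrectedQueries`, so check the current API reference before relying on it. The function name `query_with_spell_check` and the retry-once policy are my own choices, not anything Kendra prescribes.

```python
from typing import Any


def query_with_spell_check(client: Any, index_id: str, text: str) -> dict:
    """Query Kendra; if nothing matched, retry once with the suggested spelling.

    `client` is expected to behave like boto3.client("kendra").
    """
    request = {
        "IndexId": index_id,
        "QueryText": text,
        # Assumed request field: asks Kendra to return spelling suggestions.
        "SpellCorrectionConfiguration": {"IncludeQuerySpellCheckSuggestions": True},
    }
    response = client.query(**request)

    # Assumed response field: list of suggested rewrites of the query text.
    suggestions = response.get("SpellCorrectedQueries", [])
    if not response.get("ResultItems") and suggestions:
        request["QueryText"] = suggestions[0]["SuggestedQueryText"]
        response = client.query(**request)
    return response
```

You would call it with something like `query_with_spell_check(boto3.client("kendra"), "<your-index-id>", "shurts")`. Note that Kendra does not apply the correction for you in this sketch; it only suggests one, and the code decides whether to re-run the query.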

Google for "kendra search misspelled words" and start reading for suggestions and ideas.
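If you do end up correcting queries yourself in Python before sending them to Kendra, one possible approach is to snap each unknown word to the closest term in a vocabulary built from your own data. The sketch below uses only the standard library's `difflib`. The function name `correct_query`, the sample vocabulary, and the 0.75 cutoff are illustrative assumptions, and it does not handle punctuation.

```python
import difflib


def correct_query(query: str, vocabulary: list[str], cutoff: float = 0.75) -> str:
    """Replace words that are close to a known vocabulary term with that term."""
    vocab = sorted({word.lower() for word in vocabulary})
    corrected = []
    for token in query.split():
        lowered = token.lower()
        if lowered in vocab:
            corrected.append(token)  # already a known term, keep as typed
            continue
        # Best single candidate whose similarity ratio is at least `cutoff`.
        matches = difflib.get_close_matches(lowered, vocab, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else token)
    return " ".join(corrected)


# Hypothetical vocabulary taken from product names in the indexed dataset.
print(correct_query("Show me records related to shurts", ["shirts", "trousers", "jackets"]))
# prints: Show me records related to shirts
```

The cutoff is a trade-off: lower values catch more typos but also start rewriting ordinary words into product terms, so it is worth tuning against real queries.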
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


