Wednesday 22 March 2017

Looking for advice: How should I handle partial free text search on top of MongoDB with scalability in mind?

I'm looking for a way to replace my current regex-based search implementation, which simply runs a wildcard find query on a MongoDB collection of users' personal expenses:

    const where = {
      user: currentUserId,
      description: { $regex: new RegExp(`.*${escapedSearchString}.*`, 'i') },
    }

...where an escapedSearchString value of 'star wars' would match descriptions like 'Star Wars action figure' and 'Rogue One: A Star Wars Story Blu-ray'.

The main concern here is that regex queries are very expensive to execute, especially as the collection grows, so I'm looking for scalable alternatives. MongoDB's $text search does not fit the bill because the app does not target individual words in any particular language; it focuses on partial matches instead. So if a user submits a search string like 'tar w', it should match 'Star Wars'.

I was thinking that spinning up an Elasticsearch instance might be an option, but I would probably have to replicate the entire Expenses collection, including most of the fields, in Elasticsearch for optimal performance (i.e. without having to run a separate MongoDB query with the matching IDs to fetch the remaining data). Does this make sense, or should it be further optimized somehow?

Thanks in advance, and I hope other developers pondering similar questions can get something out of this too!
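
For anyone who wants to reproduce the current behaviour, here is a minimal sketch of how the filter could be built. The helper names escapeRegExp and buildWhere are hypothetical and only there for illustration; the sketch assumes the search string has already been URL-decoded. An unanchored regex matches anywhere in the string, so the surrounding `.*` is left out here.

```javascript
// Hypothetical helper (not a MongoDB API): escape regex metacharacters
// so that user input is matched literally.
function escapeRegExp(input) {
  return input.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build the find() filter for a case-insensitive
// partial match on the description field.
function buildWhere(currentUserId, searchString) {
  const escapedSearchString = escapeRegExp(searchString);
  return {
    user: currentUserId,
    description: { $regex: new RegExp(escapedSearchString, 'i') },
  };
}

// Check the matching semantics locally, without a database.
const where = buildWhere('user-1', 'tar w');
console.log(where.description.$regex.test('Star Wars action figure')); // true
```

This only pins down the matching semantics I need to preserve; it does nothing about the scaling problem, since every expense of the user still has to be checked against the pattern.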

Submitted March 22, 2017 at 07:55PM by Mjoldur
