A more performant `search` API ahead of BFCM 2024

Published: November 27, 2024
Last updated: November 27, 2024
The search API in your Gadget apps will now use an ElasticSearch cluster to handle queries and reduce overall database usage.

TLDR: Searching for records is much faster (thanks to ElasticSearch)!

Searching for data in Gadget just got a major speed boost!

Gadget’s infrastructure team has been working around the clock to scale up resources for this year’s BFCM. To free up Postgres database resources and improve the response time of the `search` API, the team has set up, fine-tuned, and deployed an ElasticSearch cluster to handle these otherwise expensive queries.

Thanks to the introduction of ElasticSearch under the hood, Gadget’s `search` API is more performant than ever.

This means that:

  • All existing queries on the platform should see a performance improvement.
  • Apps will have less overall database usage, resulting in lower monthly bills.
  • Writes will have improved throughput thanks to less work done by the core Postgres database.
  • Reads will be faster because the database has less data to parse when responding to queries.

Because the Gadget infrastructure team is handling these changes under the hood, developers using Gadget will see these improvements ahead of BFCM without lifting a finger.

Hello, ElasticSearch

All read APIs in Gadget come with a built-in `search` API that can be used across most field types to power features like autocomplete or table search. This API is not new, but thanks to ElasticSearch, responses are returned faster and queries consume fewer core database resources, improving overall platform responsiveness.

Fun fact #1: When you search for records on your model `data` pages, the same `search` API is used under the hood.

These searches were previously performed on Gadget’s core Postgres database. This meant that search competed with every other query for CPU and I/O, which could degrade both search and core database performance as the number of models across all Gadget apps and environments continued to grow.

To continue to scale resources for the thousands of apps built on Gadget for big events like BFCM, the Gadget infrastructure team introduced an ElasticSearch cluster to the Gadget stack. The cluster now handles all searching done on your Gadget apps, which improves search speed and removes that computational burden from the core Postgres database.

What changed under the hood?

Until now, searching in Postgres was fast enough to power search in existing Gadget apps without having a major impact on core database performance. It allowed Gadget to keep the underlying infrastructure powering your application’s data layer relatively simple and easy to update and manage. ElasticSearch will improve performance at the cost of adding complexity to the Gadget stack under the hood.

Before ElasticSearch, the core Postgres database faced heavy CPU and I/O consumption. The high CPU usage was largely caused by computing the `tsvector`s that are used to analyze fields and break them up into search terms, and by indexing those vectors with a computationally expensive GIN index. High I/O occurred because every `create` or `update` to a record in the database required a separate field update. Each of those updates wrote a brand new row version in Postgres and left the previous one behind as a “dead” row, increasing overall database usage. This is known as “table bloat” in Postgres. It results in slower queries and increased disk space and I/O, and it requires regular `VACUUM`s to manage.

Moving to a new search-focused database required additional migration logic to ensure that ElasticSearch stays up to date with the latest data available in Postgres. To solve this issue, a Change Data Capture (CDC) system was set up to index records in the database quickly and efficiently. This also includes a multitude of reindexing workflows that allow the infrastructure team to change how data is indexed in ElasticSearch.
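To make the change data capture idea above more concrete, here is a minimal sketch, assuming a simplified event shape: each committed change is emitted as an event, and a consumer applies events to the search index while ignoring stale or duplicate deliveries. The `ChangeEvent` type, the `applyChange` and `replay` functions, the per-record `version` number, and the in-memory `Map` standing in for the search index are all hypothetical. This is not Gadget's implementation.

```typescript
// Hypothetical change event emitted for each committed row change.
type ChangeEvent =
  | { op: "upsert"; id: string; version: number; fields: Record<string, unknown> }
  | { op: "delete"; id: string; version: number };

interface IndexedDoc {
  version: number;
  fields: Record<string, unknown>;
}

// In-memory stand-in for a search index, keyed by record id.
type SearchIndex = Map<string, IndexedDoc>;

// Highest version seen per id, so stale or replayed events can be skipped.
type VersionLog = Map<string, number>;

function applyChange(index: SearchIndex, seen: VersionLog, event: ChangeEvent): void {
  const lastSeen = seen.get(event.id) ?? -1;
  if (event.version <= lastSeen) return; // out-of-order or duplicate delivery
  seen.set(event.id, event.version);

  if (event.op === "delete") {
    index.delete(event.id);
  } else {
    index.set(event.id, { version: event.version, fields: event.fields });
  }
}

// Replaying a stream of events from scratch rebuilds the index,
// which is also the basic shape of a reindexing workflow.
function replay(events: ChangeEvent[]): SearchIndex {
  const index: SearchIndex = new Map();
  const seen: VersionLog = new Map();
  for (const event of events) applyChange(index, seen, event);
  return index;
}
```

Tracking a version per record keeps the consumer idempotent, so an event that is delivered twice or arrives late cannot overwrite newer data in the index.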

This was in addition to standing up an ElasticSearch cluster, which required tuning the indices and indexing strategies used in Gadget apps under the hood to fit within the constraints of ElasticSearch. This was a challenging task because every Gadget app has a different schema.

We made a few attempts to design an optimal index strategy. Initially, we made each model an index in ElasticSearch, but this did not work. The negative performance impact of using each model as an index was far greater than the previous setup with Postgres, where each model was a table. After some trial and error, we managed to strike a balance between the number of indices and overall performance, and worked around other ElasticSearch constraints (such as max field count per index) by using consistent hashing to share indices between apps. And since most models in Gadget are Shopify data models, we get a ton of reuse on indices. 
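The post does not describe how the hashing is keyed, so the following is only a sketch of consistent hashing in general, assuming a fixed pool of shared indices and an app identifier as the key. The `fnv1a` hash, the `HashRing` class, the `indexFor` method, the index names, and the virtual node count are all illustrative choices rather than details of Gadget's setup.

```typescript
// 32-bit FNV-1a over UTF-16 code units: small, deterministic, dependency-free.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

class HashRing {
  private readonly points: { position: number; index: string }[] = [];

  // Each index is placed on the ring many times ("virtual nodes")
  // so that keys spread more evenly across indices.
  constructor(indexNames: string[], virtualNodes = 64) {
    for (const index of indexNames) {
      for (let v = 0; v < virtualNodes; v++) {
        this.points.push({ position: fnv1a(`${index}#${v}`), index });
      }
    }
    this.points.sort((a, b) => a.position - b.position);
  }

  // Returns the index owning the first point at or after the key's
  // position, wrapping around to the start of the ring if needed.
  indexFor(key: string): string {
    if (this.points.length === 0) throw new Error("ring has no indices");
    const position = fnv1a(key);
    let lo = 0;
    let hi = this.points.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.points[mid].position < position) lo = mid + 1;
      else hi = mid;
    }
    return this.points[lo % this.points.length].index;
  }
}

// Usage: many apps share a small, fixed set of indices.
const ring = new HashRing(["shared-idx-0", "shared-idx-1", "shared-idx-2"]);
const assigned = ring.indexFor("app-1234");
```

The useful property here is that adding an index to the ring only moves keys onto the new index. Keys never shuffle between the existing ones, which limits how much data has to be reindexed when capacity changes.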

Fun fact #2: Due to the number of apps and models in the Gadget ecosystem, reindexing all Gadget apps takes roughly two days to complete. We have reindexed all apps multiple times while testing different strategies and configurations.

We also needed to ensure that API requests combining `search` with `filter` or `sort` conditions behaved appropriately. ElasticSearch needs to perform that filtering and sorting, which means that we need to generate ES queries for all `filter` and `sort` conditions included on reads. To achieve this, we now transform all user-defined `filter` and `sort` statements into Gelly queries, then convert each Gelly query into an ElasticSearch query.
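As a rough illustration of the kind of translation involved, the sketch below turns a simplified read request into an ElasticSearch query body. It skips the intermediate Gelly step entirely and supports only three filter operators. The `ReadOptions`, `FieldFilter`, and `Sort` shapes and the `toElasticQuery` function are assumptions made for this example. Only the output uses real ElasticSearch query DSL (`bool`, `term`, `range`, `simple_query_string`, and `sort`).

```typescript
// Hypothetical, simplified filter and sort shapes for this sketch.
type FieldFilter =
  | { equals: string | number | boolean }
  | { greaterThan: number }
  | { lessThan: number };

type Filter = Record<string, FieldFilter>;
type Sort = Record<string, "Ascending" | "Descending">;

interface ReadOptions {
  search?: string;
  filter?: Filter;
  sort?: Sort;
}

function toElasticQuery(options: ReadOptions): Record<string, unknown> {
  const must: object[] = [];
  const filterClauses: object[] = [];

  if (options.search) {
    // Full-text clause: this is what contributes to the relevancy score.
    must.push({ simple_query_string: { query: options.search } });
  }

  for (const [field, condition] of Object.entries(options.filter ?? {})) {
    // Filter clauses narrow the result set without affecting the score.
    if ("equals" in condition) {
      filterClauses.push({ term: { [field]: condition.equals } });
    } else if ("greaterThan" in condition) {
      filterClauses.push({ range: { [field]: { gt: condition.greaterThan } } });
    } else {
      filterClauses.push({ range: { [field]: { lt: condition.lessThan } } });
    }
  }

  const sort = Object.entries(options.sort ?? {}).map(([field, direction]) => ({
    [field]: { order: direction === "Ascending" ? "asc" : "desc" },
  }));

  const body: Record<string, unknown> = {
    query: { bool: { must, filter: filterClauses } },
  };
  // With no explicit sort, ElasticSearch orders hits by relevancy score.
  if (sort.length > 0) body["sort"] = sort;
  return body;
}
```

Putting the search term in `must` and the conditions in `filter` keeps scoring tied to the text match only, so filters restrict which records come back without changing their relevancy.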

As a final step, search query results from the legacy Postgres search were compared against ElasticSearch results. Result sets are expected to be a bit different due to how ElasticSearch handles relevancy scoring under the hood, but the infrastructure team determined that results were close enough to go forward and deploy ElasticSearch into production.
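The post does not say how closeness was measured. One simple option, shown here purely as an assumption, is to compare the sets of record ids returned by each backend for the same query and ignore ordering, since relevancy scoring is expected to differ. The `overlap` function and its inputs are hypothetical.

```typescript
// Jaccard similarity of two result sets: shared ids divided by all distinct ids.
// Returns 1 for identical sets and 0 for sets with nothing in common.
function overlap(legacyIds: string[], candidateIds: string[]): number {
  const legacy = new Set(legacyIds);
  const candidate = new Set(candidateIds);
  if (legacy.size === 0 && candidate.size === 0) return 1; // both empty: identical
  const shared = Array.from(legacy).filter((id) => candidate.has(id)).length;
  return shared / (legacy.size + candidate.size - shared);
}

// Usage: two of the four distinct ids appear in both result sets.
const score = overlap(["1", "2", "3"], ["3", "2", "4"]); // 0.5
```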

Results using the `search` API that do not include `filter` or `sort` conditions will now be ordered based on relevancy as determined by ElasticSearch. Because of this, changes to search have not yet been rolled out to apps using search in production, giving the Gadget team more time to understand and analyze existing use cases (and so we don’t break any production functionality!).

Note: The `search` API cannot be combined with related model filtering.

More improvements to search in Gadget are planned for the future. Keep an eye on our changelog for updates. If you have questions about the `search` API or any of the under-the-hood changes done to scale up for BFCM, feel free to ask the team in our developer Discord.

Author: Jason Gedge
Reviewer: Riley Draward