In the past months we redesigned our Customer App to be able to add some new features like a personalized home for the user and a page where the user can discover products and shops.
This is what came out:
When the user opens the App and he has a valid access token we show him the Home screen. At the moment we do not have a Backend For Frontend or something like GraphQL that returns all the data needed doing one call, so we need to make different calls to our APIs. A couple of calls are done in the UI thread and all the others are made in background.
After a load test, simulating the behavior of our users, we noticed that some of the endpoints we were calling in background to load the Home screen were quite slow (>= 1 sec). One of them was the endpoint to get the notifications sent to a customer which returns:
- The number of notifications not read
- The first X notifications
We expected that endpoint to be slow because we were getting the data we needed from a couple of tables which were not exactly made for reading but 1.5 seconds was really a lot.
With the Application Insights profiler we found out the request was spending quite a lot of time waiting for the Entity Framework ToListAsync method, which makes sense because we need to get the data from the database but at the same time it is pretty weird that it takes so much time to have an average response time of 1.5 seconds.
After googling a little bit around we found these posts:
The summary of those links is that EF and EF Core async methods do not work well with big data types (varchar(MAX), nvarchar(MAX), varbinary(MAX), text, …).
We decided to take a look at the SQL query and the table and we discovered we were filtering on:
- Columns without indexes
- a nvarchar(MAX) column
Yes, we should not have such code in production but that table was created for a different purpose and then adapted to work with our Customer App. Still not an excuse, but shit happens.
When we were running the load test, Bender, our monitoring bot, notified us that the database was under load.
Checking SQL Query Performance Insight we saw two queries were using quite a lot of CPU, and those queries were the ones used to get the number of notifications not read and the first X notifications.
So, the problem has been identified. How do we solve it?
We have different solutions:
- Create the missing indexes and re-size the nvarchar(MAX) column to use less space
- Re-design the table
- Create a new table just for reading data
We decided to go for the 1st solution, which is not the best, but we wanted a quick solution to solve the problem.
We reduced the size of the column with varchar(MAX), created the missing indexes and ran another load test: 239ms on average, quite a nice improvement.
Why is the 1st solution not the best?
- It’s not the best thing to have an index on a varchar column
- We are still making a couple of queries to retrieve the data we need
Why did we choose the 1st solution if it is not the best?
Because we are changing the way we get data for the Customer App… but we will talk about it in the next post.