I have table A with millions of rows and a one-to-many relationship to table B, where each row in A has only 1-4 matching rows in B. As part of the returned data, I need to include the most recent PromotionDay from B, so I wrote this, which runs in under a second:
    var query = context.A.AsNoTracking().Select(x => new {
        PromotionDay = context.B.Max(p => p.PromotionDay),
        // Lots of other properties from A
    });
This is a search API, so sometimes they'll send a PromotionDay filter, meaning I only want those records to be returned.
I naively tried this:
    if (filters.PromotionDate is { } promotionDate)
        query = query.Where(x => x.PromotionDay == promotionDate);
Now the query takes over 40 seconds. When I look at the generated SQL, I see that the filter is added as a WHERE clause that repeats the subquery:
    WHERE (
        SELECT max(promotion_day)
        FROM B
        WHERE ...
    ) = @__promotionDate_0
So it's now running that same MAX(promotion_day) subquery repeatedly, and the query takes 43 seconds to run. What's the proper way to do this so the filter is applied in the database, rather than loading millions of rows into memory and then applying the filter?
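For reference, here is a minimal, untested sketch of one possible way to express the same filter without comparing against the computed MAX. It relies on the fact that "the latest PromotionDay equals the filter date" is the same as "a B row exists on that date and no B row exists after it". The entity shapes and the key names `Id` and `AId` are assumptions, since the real correlation condition is elided above, and `WhereLatestPromotionIs` is a hypothetical helper name. Whether this is actually faster depends on the database and on an index covering the foreign key and promotion_day, so treat it as something to measure rather than as a known fix.

```csharp
using System;
using System.Linq;

// Hypothetical entity shapes; Id and AId are assumed names for the key columns.
public class A { public int Id { get; set; } }
public class B { public int AId { get; set; } public DateTime PromotionDay { get; set; } }

public static class PromotionFilter
{
    // Keeps the rows of A whose most recent PromotionDay equals promotionDate,
    // written as "a B row exists on that day, and none exists after it".
    public static IQueryable<A> WhereLatestPromotionIs(
        IQueryable<A> a, IQueryable<B> b, DateTime promotionDate) =>
        a.Where(x =>
            b.Any(p => p.AId == x.Id && p.PromotionDay == promotionDate) &&
            !b.Any(p => p.AId == x.Id && p.PromotionDay > promotionDate));
}
```

The idea would be to apply this to context.A before the Select projection, so the filter runs against A and B directly instead of against the projected PromotionDay value.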