1

I am not seeing the expected SQL generated with regard to Distinct when selecting multiple columns. In particular the Distinct is being left out of the generated SQL. Strangely, this seems to work properly when my Select clause only has 1 column. How do I achieve a Distinct SQL query using Entity Framework when selecting multiple columns? Group By also does not work when the key is multiple columns.

In these examples, "e" is my System.Data.Entity.DbContext.

Query that selects multiple columns ignores distinct:

var distinct1 = e.MyViews
                 .Select(a => new { a.Col1, a.Col2, a.Col3, })
                 .Distinct()
                 .ToList();
                

Generated SQL is missing Distinct:

SELECT 
    [Extent1].[Col1] AS [Col1], 
    [Extent1].[Col2] AS [Col2], 
    [Extent1].[Col3] AS [Col3]
FROM 
    (SELECT 
         [MyView].[Col1] AS [Col1], 
         [MyView].[Col2] AS [Col2], 
         [MyView].[Col3] AS [Col3], 
         [MyView].[UserId] AS [UserId]
     FROM 
         [MySchema].[MyView] AS [MyView]) AS [Extent1]

Query that selects single column uses Distinct:

var distinct2 = e.MyViews
                 .Select(a => new { a.Col1, })
                 .Distinct()
                 .ToList();

The generated SQL correctly includes Distinct:

SELECT 
    [Distinct1].[C1] AS [C1], 
    [Distinct1].[Col1] AS [Col1]
FROM 
    (SELECT DISTINCT 
         [Extent1].[Col1] AS [Col1], 
         1 AS [C1]
     FROM 
         (SELECT 
              [MyView].[Col1] AS [Col1], 
              [MyView].[Col2] AS [Col2], 
              [MyView].[Col3] AS [Col3], 
              [MyView].[UserId] AS [UserId]
          FROM 
              [MySchema].[MyView] AS [MyView]) AS [Extent1]) AS [Distinct1] 

Note that this is running on .Net Framework 4.8 and is the original EF 6, not EF Core.

Running against SQL Server (Azure SQL Database Managed Instance).

I have confirmed that EF Core correctly injects Distinct in the generated SQL with multiple columns in the Select, but that doesn't really help me with my .Net Framework project.

Update

I am now finding that some combinations of columns in the Select clause do work, while others do not, but I cannot determine what the differentiating factor is.

These combinations of columns in Select do produce Distinct in SQL:
Col1
Col2
Col3
Col1, Col2
Col2, Col3

These combinations of columns in Select do not produce Distinct in SQL:
Col1, Col3
Col1, Col2, Col3

7
  • Distinct works in Entity Framework 6. You'll have to provide more information. What is e? How is it configured? Is the PK or any unique index or constraint composed by those columns? If the columns are unique there's no reason to use DISTINCT Commented Jul 14, 2023 at 14:02
  • Also, Entity Framework 6 or EF Core 6? What database are you using? The SQL query is created by the database provider, not EF itself. Some providers are smarter than others Commented Jul 14, 2023 at 14:05
  • Despite what the SQL comes back with, are the results distinct or not? It may be that EF thinks in some cases it might be more efficient to do in memory. Is this with a clean DBContext with no entities loaded/tracked? Normally when looking for distinct sets of values from a table/view it is better to use GroupBy rather than Distinct. Another question would by why the SELECT is selecting a UserID if that is not in your Select expression. Commented Jul 24, 2023 at 10:31
  • @StevePy, group by is also getting ignored for this case. This is a clean DBContext that was instantiated just before running the select. The results of the query are absolutely not distinct. It is returning thousands of records, where the correct distinct would produce less than 100. Commented Jul 31, 2023 at 12:58
  • This requires a minimal reproducible example because we need to see everything that is involved, esp. data types, unique indexes and EF's mappings. If EF has any information in its model that the queried combination is surely unique it won't add Distinct. Commented Jan 7, 2024 at 19:17

1 Answer 1

0

I had the same problem Wwe changed the view where we were executing the distinct from EF, making the nullable colums not nullable. This made the new non nullable colums "primary keys" of the view, making the distinct working. Our idea is that since EF thought that the columns we were selecting were the all primay keys, wasn't applying the distinct correctly.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.