5

I have a query where I want to return all the rows which are associated with a list of values. You could write this very simply as:

select * from TableA where ColumnB in (1, 2, 3, 5)

I could generate this query in C# and execute it. However, this is less than ideal: it doesn't use parameters, it makes poor use of the query plan cache, and it is vulnerable to a SQL injection attack.

An alternative is to write this as:

select * from TableA where ColumnB = @value

C# could execute this once per value; however, that results in N DB hits.

The only other alternative I can see is to create a temp table and join to it; however, I don't see the point of this, as it would be more complex and suffer from the same limitations as the first option.

I'm using SQL Server and OLEDB. Creating the query isn't the issue; I'm trying to find the most efficient process.

Which of these three methods is the most efficient? Have I missed an alternative?

5
  • How do you want to execute the query? EF, LINQ, ADO, OLEDB? Commented Jul 19, 2012 at 13:05
  • And which server? MySql, MsSql, other? Commented Jul 19, 2012 at 13:07
  • OLEDB and MsSQL, question updated. Commented Jul 19, 2012 at 13:19
  • Are you using SQL Server 2008? I can provide another approach using a table-valued parameter. Commented Jul 19, 2012 at 13:39
  • 1
    Please specify SQL Server version when asking SQL Server questions. This will prevent people from spending time developing solutions that you can't use. Commented Jul 19, 2012 at 13:43

3 Answers

4

Assuming SQL Server 2008 or newer, create a table type once:

CREATE TYPE dbo.ColumnBValues AS TABLE
(
  ColumnB INT
);

Then a stored procedure that takes such a type as input:

CREATE PROCEDURE dbo.whatever
  @ColumnBValues dbo.ColumnBValues READONLY
AS
BEGIN
  SET NOCOUNT ON;

  SELECT A.* FROM dbo.TableA AS A
    INNER JOIN @ColumnBValues AS c
    ON A.ColumnB = c.ColumnB;
END
GO

Now in C#, create a DataTable and pass that as a parameter to the stored procedure:

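DataTable cbv = new DataTable();
// Type the column as int so it matches the INT column of dbo.ColumnBValues;
// an untyped DataColumn defaults to string.
cbv.Columns.Add(new DataColumn("ColumnB", typeof(int)));

// in a loop from a collection, presumably:
cbv.Rows.Add(someThing.someValue);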

using (connectionObject)
{
    SqlCommand cmd        = new SqlCommand("dbo.whatever", connectionObject);
    cmd.CommandType       = CommandType.StoredProcedure;
    SqlParameter cbvParam = cmd.Parameters.AddWithValue("@ColumnBValues", cbv);
    cbvParam.SqlDbType    = SqlDbType.Structured;
    //cmd.Execute...;
}

(You might want to make the type a lot more generic; I named it specifically to make it clear what it is doing.)
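
The "in a loop from a collection" step above could be wrapped in a small helper. This is only a sketch: `TvpHelper` and `ToColumnBTable` are hypothetical names, and the code assumes the values are plain `int`s that map onto the single `ColumnB INT` column of `dbo.ColumnBValues`.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class TvpHelper
{
    // Hypothetical helper: builds a one-column DataTable whose shape matches
    // the dbo.ColumnBValues table type (a single INT column named ColumnB).
    public static DataTable ToColumnBTable(IEnumerable<int> values)
    {
        var table = new DataTable();
        table.Columns.Add("ColumnB", typeof(int));

        // One row per value; the rows become the rows of the table-valued parameter.
        foreach (int value in values)
        {
            table.Rows.Add(value);
        }

        return table;
    }
}
```

The result can then stand in for `cbv`, for example `DataTable cbv = TvpHelper.ToColumnBTable(new[] { 1, 2, 3, 5 });`, before it is passed as the `@ColumnBValues` parameter.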

2

You can also use multiple result sets and send a batch of queries like this:

select * from TableA where ColumnB = @value0
select * from TableA where ColumnB = @value1
select * from TableA where ColumnB = @value2
...
select * from TableA where ColumnB = @valuen

in a single call. Even if it seems counterintuitive, it reuses the cached execution plan and is safe in terms of parameterization.
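
A rough sketch of how the batch text above might be generated in C#. `BatchBuilder` and `BuildBatchSql` are hypothetical names, and the `@value0` … `@valueN` naming simply follows the answer; nothing here talks to a database.

```csharp
using System;
using System.Text;

public static class BatchBuilder
{
    // Hypothetical helper: emits one parameterized SELECT per value, so the
    // whole batch can be sent to the server in a single round trip.
    public static string BuildBatchSql(int count)
    {
        if (count <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(count));
        }

        var sql = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            // Only the parameter name varies; the values never enter the SQL text.
            sql.Append("select * from TableA where ColumnB = @value")
               .Append(i)
               .AppendLine(";");
        }

        return sql.ToString();
    }
}
```

Each value would then be bound to its `@valueN` parameter on the command, and the rows read by calling `NextResult()` on the data reader to move from one result set to the next.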

6 Comments

I'm curious about the -1 I received.
Isn't this exactly what the OP said they didn't want to do?
@AaronBertrand Not exactly: he wants to reuse the execution plan, and this will. Note that these are not separate round trips; all the queries are executed in a single round trip to the DB. As a plus, the query is correctly parameterized, which avoids injection.
And if ColumnB is not indexed, this will require N scans of the entire table. And even if the OP's column is indexed, the next reader's column might not be. Round trips are not the only reason splitting this up into multiple queries makes it less efficient.
If it is not indexed, scans happen with the IN too.

0

You can easily write this:

String csvString = "1, 2, 3, 5"; // Build the list somehow, don't forget escaping
String query = "select * from TableA where ColumnB in (" + csvString + ")";

This way, performance doesn't decrease, and you can prevent SQL injection simply by escaping the input values while building csvString.
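
As a variation on the concatenation above, the list can be built from parameter placeholders instead of escaped values. To be clear, this is a different technique from the escaping described in this answer, sketched under assumptions: `InListBuilder` and `BuildInQuery` are hypothetical names, and the `@p0`-style names assume a provider with named parameters such as SqlClient (OLE DB providers typically use positional `?` markers instead).

```csharp
using System;

public static class InListBuilder
{
    // Hypothetical helper: builds "... in (@p0, @p1, ...)" with one placeholder
    // per value, so the values themselves are sent as parameters.
    public static string BuildInQuery(int count)
    {
        if (count <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(count));
        }

        var names = new string[count];
        for (int i = 0; i < count; i++)
        {
            names[i] = "@p" + i;
        }

        return "select * from TableA where ColumnB in (" + string.Join(", ", names) + ")";
    }
}
```

Because the SQL text depends only on the number of values, two lists of the same length produce identical command text. SQL Server caps a single request at 2,100 parameters, so very long lists would need a different approach.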

BTW, if you use MS SQL instead of standard SQL, you can find alternative ways.

2 Comments

Yes, this is how I'm currently doing it. My problem is that this will create a new Execution Plan each time this query executes as the command is different thereby slowing performance considerably...
@Liath you can avoid some of this by using the optimize for ad hoc workloads setting. This way plans are not cached until a specific query has been executed twice. Your query is still going to yield a scan but it won't take up space in your plan cache unless it should (e.g. it really is re-used).
