0

I'm using U-SQL with a table in Azure Data Lake Analytics. Is there any way to pass a list of partition keys generated in a C# program to the U-SQL script then have the script return all the elements in those partitions?

1 Answer 1

2

Do you want to run the C# code on your dev box and pass values to a U-SQL script or run C# code inside your U-SQL Script? Your description is not clear. Based on your question title, I will answer your first question.

Passing values as parameters from a C# program: The ADLA SDK (unlike Azure Data Factory) does not yet provide a parameter model for U-SQL scripts (please file a request at http://aka.ms/adlfeedback, although I know it is on our backlog already, having external customer demand helps in prioritization).

However it is fairly easy to add your parameter values by prepending DECLARE statements like the following in the beginning of the script and have the script refer to them as variables.

DECLARE @param = new SqlArray<int>( 1, 2, 3, 4); // 1,2,3,4 were calculated in your C# code (I assume you have int partition keys).

Then you should be able to use the array in a predicate (e.g., @param.Contains(partition_col)). That will not (yet, we have a workitem for it) trigger partition elimination though.

If you want partition elimination, you will have to have a fixed set of parameter values and use them in an IN clause. E.g., you want to check up to 3 months, you would write the query predicate as:

WHERE partition_col IN (@p1, @p2, @p3);

And you prepend definitions for @p1, @p2 and @p3, possibly duplicating values for the parameters you do not need.

Sign up to request clarification or add additional context in comments.

7 Comments

So all of the parameter values must be hardcoded ahead of time?
I have a C# function (getKeys) that generates a list of string partition keys to query a Data Lake Store database with. Is there any way to query the database with U-SQL with those partition keys (i.e. not hardcoded) in such a way that partition elimination occurs? My code that doesn't result in partition elimination is below: @results = SELECT RowKey FROM Schema.TableName WHERE USQLApp.queryHelper.getKeys().Contains(PartitionKey)
Currently only simple comparison predicates where the values can be constant-folded (e.g., are known at compile time) can be used for partition elimination. The hard coding of the parameter values is not that big of an issue since every script will be compiled on submission. Thus adding the parameter values at the submission time works most of the time.
If you can only hardcode parameters, is it possible to use U-SQL and ADLA with a .NET application in which the user can indirectly change parameters? In other words, if the user can set filters that alter the partition keys being queried, is it at all possible to change the query?
If script is a C# variable containing your script with a variable parameter @p, and p is the variable containing your parameterized value, you would write something like the following before submitting it with the SDK: script = String.Format("DECLARE @p = {0}; {1}", p, script); Obviously, you should also guard against SQL Injections if p is a string.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.