1

I have a JSON input that looks like

{
    "sessionId": 1234,
    "deviceId": "MAC:1234",
    "IoTHub": {
        "MessageId": "1234-1234-1234-1234"
    }
}

How can I extract the values of sessionId, deviceId and MessageId in a Azure Datalake Analytics usql script?

1 Answer 1

1

How to use JSON file formats in the context of Azure Data Lake Analytics respectively usql

Set Up process

  • Download the repo from [1]
    • Open the solution in .\Examples\DataFormats\Microsoft.Analytics.Samples.sln
    • Build the solution
    • Get .\Examples\DataFormats\Microsoft.Analytics.Samples.Formats\bin\Debug\Microsoft.Analytics.Samples.Formats.dll
    • Get .\Examples\DataFormats\Microsoft.Analytics.Samples.Formats\bin\Debug\Newtonsoft.Json.dll
  • In ADLS create a folder where to store the assemblies (e.g. .\assemblies)
    • Create a sample file in the folder if it disappears (happened to me)
  • With Visual Studio:
    • Add both files to the folder of the ADLA solution (e.g. .\lib\...)
    • Open Cloud Explorer, navigate to the ADLA Database -> Assemblies -> right click and register the assemblies
      • Store the assemblies in the folder created before in ADLS
  • Reference the assemblies as can be seen below

Usage

JSON Input Data

  • Use the Extractor as e.g. shown below
    • For more information see [2] and [3]

Resources

[1] GitHub Azure USQL

[2] GitHub Azure USQL DataFormats

[3] U-SQL - Extract data from json-array


U-SQL Script

DECLARE @localDevelopment bool = true;

IF @localDevelopment == true THEN
    DROP ASSEMBLY IF EXISTS [Newtonsoft.Json];
    DROP ASSEMBLY IF EXISTS [Microsoft.Analytics.Samples.Formats];
    CREATE ASSEMBLY [Newtonsoft.Json] FROM @"/lib/Newtonsoft.Json.dll";
    CREATE ASSEMBLY [Microsoft.Analytics.Samples.Formats] FROM @"/lib/Microsoft.Analytics.Samples.Formats.dll";
    DECLARE @input string = @"/data/input.json";
    DECLARE @output string = @"/data/output.csv";
ELSE
    DECLARE @input string = @"/data/input.json";
    DECLARE @output string = @"/data/output.csv";
END;

REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats]; 

//Define schema of file, must map all columns
//Names must match keys
@extractDataFirstLevel = 
    EXTRACT sessionId int,
            deviceId string,
            IoTHub string
            //Date DateTime
    FROM @input
    USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor();

@selectData =
    SELECT sessionId,
           deviceId,
           Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(IoTHub)["MessageId"] AS messageId
    FROM @extractDataFirstLevel;

OUTPUT @selectData
TO @output
USING Outputters.Csv();
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.