I'm currently working with a survey API to retrieve results and store them in our data warehouse (a SQL database). The results are returned as a JSON object that includes an array ("submissions") containing each individual's responses. Each submission in turn contains an array ("answers") with the answer to each question in the survey.

I would like each submission to be one row in one table.

I will provide some very simple data examples and am just looking for a general way to approach this problem. I certainly am not looking for an entire solution.

The API returns a response like this:

{
  "surveyName": "Sample Survey",
  "count": 2,
  "submissions": [
    {
      "id": 1,
      "created": "2021-01-01T12:00:00.000Z",
      "answers": [
        {
           "question_id": 1,
           "answer": "Yes"
        },
        {
           "question_id": 2,
           "answer": 5
        }
      ]
    },
    {
      "id": 2,
      "created": "2021-01-02T12:00:00.000Z",
      "answers": [
        {
           "question_id": 1,
           "answer": "No"
        },
        {
           "question_id": 2,
           "answer": 4
        }
      ]
    }
  ]
}

Essentially, I want to insert one row per submission into a SQL table with the columns id, created, answer1, answer2. Within the Sink tab of the Copy Data activity, I cannot figure out how to say, "If question_id = 1, map the answer to column answer1. If question_id = 2, map the answer to column answer2."

Will I likely have to use a Data Flow to handle this sort of mapping? If so, can you think of the general steps included in that type of flow?

  • Check out this method, which uses native SQL JSON abilities: stackoverflow.com/a/60175144/1527504 Commented Aug 4, 2021 at 21:22
  • Yes, I would use a data flow and a Derived Column to set the proper value using either an iif() or a case() expression. Commented Aug 5, 2021 at 6:47
  • Please upvote this for nested JSON array handling in ADF: feedback.azure.com/d365community/idea/… Commented Mar 15, 2022 at 14:47

2 Answers

For those looking for a similar solution, I'll post the general idea on how I solved this problem, thanks to the suggestion from @mark-kromer-msft.

The portion of my pipeline that obtains the JSON files is not shown here. For that, I used an Until loop to paginate through this particular endpoint and obtain all submission results, with a Copy Data activity that writes each page to a JSON file in blob storage. After that, I created a Data Flow.
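The answer does not show the Until loop, and the endpoint's pagination scheme is not described. Purely as an illustration of the idea, the loop might resemble the following Python sketch. Everything in it is an assumption: `fetch_page` is a hypothetical stand-in for the API call, pages are assumed to be numbered from 1, and an empty "submissions" array is assumed to mark the end of the data.

```python
from typing import Callable, Iterator


def iter_pages(fetch_page: Callable[[int], dict], max_pages: int = 1000) -> Iterator[dict]:
    """Yield response bodies page by page until a page has no submissions."""
    for page in range(1, max_pages + 1):
        body = fetch_page(page)
        # Assumed end-of-data signal: a missing or empty "submissions" array.
        if not body.get("submissions"):
            return
        yield body


# Hypothetical stand-in for the real HTTP call: two pages of data, then nothing.
fake_pages = {
    1: {"submissions": [{"id": 1}]},
    2: {"submissions": [{"id": 2}]},
}


def fake_fetch(page: int) -> dict:
    return fake_pages.get(page, {"submissions": []})


pages = list(iter_pages(fake_fetch))
print(len(pages))  # 2
```

In the pipeline described here, each yielded page corresponds to one JSON file written to blob storage by the Copy Data activity.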

I first had to flatten the "submissions" array so that each submission becomes its own row. I then used a Derived Column transformation to pull each answer out into a separate column.

Here's one example of an Expression:

find(submissions.answers, equals(#item.question_id, '1')).answer
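For readers who do not know the data flow expression language, the logic of the Flatten and Derived Column steps can be sketched in plain Python. This is not ADF code, only an illustration of what the steps compute on the sample response from the question. The helper name `find_answer` is hypothetical, and comparing the ids as strings is an assumption made so that both 1 and '1' match.

```python
import json

# Sample response from the question (trailing commas removed so it parses).
response = json.loads("""
{
  "surveyName": "Sample Survey",
  "count": 2,
  "submissions": [
    {"id": 1, "created": "2021-01-01T12:00:00.000Z",
     "answers": [{"question_id": 1, "answer": "Yes"},
                 {"question_id": 2, "answer": 5}]},
    {"id": 2, "created": "2021-01-02T12:00:00.000Z",
     "answers": [{"question_id": 1, "answer": "No"},
                 {"question_id": 2, "answer": 4}]}
  ]
}
""")


def find_answer(answers, question_id):
    """Return the answer of the first entry with a matching question_id, else None."""
    for item in answers:
        # Compare as strings so that 1 and '1' are treated as the same id.
        if str(item["question_id"]) == str(question_id):
            return item["answer"]
    return None


# "Flatten": one row per submission; "Derived Column": one column per question.
rows = [
    {
        "id": sub["id"],
        "created": sub["created"],
        "answer1": find_answer(sub["answers"], 1),
        "answer2": find_answer(sub["answers"], 2),
    }
    for sub in response["submissions"]
]

for row in rows:
    print(row)
```

A submission that lacks an answer for a question gets None in that column, which would map to NULL in the SQL table.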

Finally, in the last step (Sink), I mapped the derived columns to the columns of the SQL table.

An alternative approach would be to use the native JSON abilities of Azure SQL DB. Use a Stored Procedure activity, pass the JSON in as a parameter, and shred it in the database using OPENJSON:

-- Submission level: one row per element of the "submissions" array
-- INSERT INTO yourTable ( ...
SELECT
    s.surveyName,
    s.xcount,
    so.[value] AS submission    -- JSON of this one submission
FROM OPENJSON( @json )
WITH (
    surveyName  VARCHAR(50)     '$.surveyName',
    xcount      INT             '$.count',
    submissions NVARCHAR(MAX) AS JSON
) s
    CROSS APPLY OPENJSON( s.submissions ) so;


-- Question level: one row per answer; needs an additional CROSS APPLY and JSON_VALUE calls
-- INSERT INTO yourTable ( ...
SELECT
    s.surveyName,
    s.xcount,
    JSON_VALUE ( so.[value], '$.id' ) AS id,
    JSON_VALUE ( so.[value], '$.created' ) AS created,
    JSON_VALUE ( a.[value], '$.question_id' ) AS question_id,
    JSON_VALUE ( a.[value], '$.answer' ) AS answer
FROM OPENJSON( @json )
WITH (
    surveyName  VARCHAR(50)     '$.surveyName',
    xcount      INT             '$.count',
    submissions NVARCHAR(MAX) AS JSON
) s
    CROSS APPLY OPENJSON( s.submissions ) so
        CROSS APPLY OPENJSON( so.[value], '$.answers' ) a;

The first query returns one row per submission; the second returns one row per answer.

Full script with sample JSON here.
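Note that the question-level query returns one row per answer, while the question asks for one row per submission with the columns id, created, answer1, answer2. A pivot step is therefore still needed; in T-SQL this is commonly done with conditional aggregation (a CASE expression per question inside an aggregate, grouped by submission). The Python sketch below only illustrates that pivot logic on rows shaped like the query output. The function name `pivot` and the `answerN` column naming are assumptions for illustration, and the values are strings because JSON_VALUE returns text.

```python
# Rows shaped like the question-level query output:
# (id, created, question_id, answer), all as text.
question_rows = [
    ("1", "2021-01-01T12:00:00.000Z", "1", "Yes"),
    ("1", "2021-01-01T12:00:00.000Z", "2", "5"),
    ("2", "2021-01-02T12:00:00.000Z", "1", "No"),
    ("2", "2021-01-02T12:00:00.000Z", "2", "4"),
]


def pivot(rows):
    """Collapse one-row-per-answer into one-row-per-submission."""
    by_submission = {}
    for sub_id, created, question_id, answer in rows:
        # Create the submission row on first sight, then add one column per question.
        row = by_submission.setdefault(sub_id, {"id": sub_id, "created": created})
        row[f"answer{question_id}"] = answer
    return list(by_submission.values())


pivoted = pivot(question_rows)
for row in pivoted:
    print(row)
```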
