Here's my situation
I have Azure Table as a source, and my target is Azure SQL database. Source table looks something like this:
| id | file_name | metadata |
|---|---|---|
| 1 | file_1.txt | {"Company": { "id": 555, "Name": "Company A" }, "quality": [{"quality": 3, "file_name": "file_1.txt"}, {"quality": 4, "file_name": "unkown"}]} |
| 2 | file_2.txt | {"Company": { "id": 231, "Name": "Company B" }, "quality": [{"quality": 4, "file_name": "file_2.txt"}, {"quality": 3, "file_name": "unkown"}]} |
| 3 | file_3.txt | {"Company": { "id": 111, "Name": "Company C" }, "quality": [{"quality": 5, "file_name": "unknown"}, {"quality": 4, "file_name": "file_3.txt"}]} |
The target table is supposed to look like this:
| id | file_name | company | quality |
|---|---|---|---|
| 1 | file_1.txt | Company A | 3 |
| 2 | file_2.txt | Company B | 4 |
| 3 | file_3.txt | Company C | 4 |
That means that I need to parse the data from this string to get the new column values, as well as use quality value depending on the file_name column from the source.
The first thing I've done is created a Copy pipeline to transfer the data 1 to 1 from Azure Tables to parquet file on Azure Data Lake Store so I can use it as a source in Data Flow. Next, the idea was to use derived column and use some expression to get the data but as far as I can see, there's no expression that treats this string as a JSON object.
So, the next idea was to maybe add a step before this process where I would extract the contents of metadata column to a separate file on ADLS and use that file as a source or lookup and define it as a JSON file to begin with. This would imply that I need to add id value to the JSON file so I'm able to tie the data back to the record.
The first solution looked more promising as the idea, but if that's not an option, I'll look into other possible solutions.
Thanks







