I have a very large gzip-compressed file of JSON data. Due to some limitations, I cannot extract and transform the data beforehand. The JSON data itself is also very dynamic in nature.
For example:
{"name": "yourname", "age": "your age", "schooling": {"high-school-name1": "span of years studied"}}
{"name": "yourname", "age": "your age", "schooling": {"high-school-name2": "span of years studied"}}
The problem is that the high-school-name key is dynamic: it is different for different sets of users.
When uploading to BigQuery, I cannot determine which type I should specify for the schooling field, or how to handle this upload to BigQuery at all.
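For reference, this is a minimal sketch of the kind of load job I have in mind. The bucket, dataset, and table names are placeholders, and declaring schooling as a BigQuery JSON-typed column is only one idea I am considering so the dynamic keys do not each become their own column; I am not sure it is the right approach:

```python
from google.cloud import bigquery

client = bigquery.Client()

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    schema=[
        bigquery.SchemaField("name", "STRING"),
        bigquery.SchemaField("age", "STRING"),
        # Keep the whole schooling object in one column instead of one
        # column per high-school-name key.
        bigquery.SchemaField("schooling", "JSON"),
    ],
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)

# BigQuery reads the gzipped newline-delimited JSON directly from
# Cloud Storage, so nothing is decompressed or transformed by me.
load_job = client.load_table_from_uri(
    "gs://my-bucket/users.json.gz",   # placeholder path
    "my-project.my_dataset.users",    # placeholder table
    job_config=job_config,
)
load_job.result()
```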
I am using a Cloud Function to automate the flow: as soon as the file is uploaded to Cloud Storage, it triggers the function. Because the Cloud Function has a very low memory limit, there is no way to transform the data there. I have looked into Dataprep for this, but I am trying to understand whether I am missing something that would make this possible without using any other services.
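Roughly, the function only submits a BigQuery load job for the uploaded object's URI, so the file never passes through the function's memory; a minimal sketch (dataset and table names are placeholders):

```python
from google.cloud import bigquery

client = bigquery.Client()

def gcs_to_bigquery(event, context):
    """Triggered when a file is finalized in the Cloud Storage bucket."""
    uri = f"gs://{event['bucket']}/{event['name']}"
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        autodetect=True,  # this is where the dynamic schooling keys cause trouble
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )
    # The load job runs inside BigQuery; the function only submits it,
    # so the function's memory limit is not an issue for this step.
    job = client.load_table_from_uri(uri, "my_dataset.users", job_config=job_config)
    job.result()
```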
Do you mean that high-school-name1, high-school-name2, ..., high-school-nameXXX should all end up under a single column named high-school-name, with "span of years studied" as its value? Which result do you wish/expect? Can you update your question with this detail?