I am trying to convert a JSON object to a map object in Hive using the brickhouse library's `brickhouse.udf.json.FromJsonUDF`.

The problem is that my JSON object contains values of different types: strings and one array of arrays.

My json looks like this:

'{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}'

I can read the array-of-arrays element (key4) correctly using the following:

select from_json('{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}', 'map<string,array<array<string>>>') from my_table limit 1;

Which gives me:

{"key1":[],"key3":[],"key2":[],"key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}

As you can see, all the elements except key4 are empty.

Or I can read the other elements, but not key4, using:

select from_json('{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}', 'map<string,string>') from my_table limit 1;

Which gives me:

{"key1":"value1","key3":"value3","key2":"value2","key4":null}

But how can I convert all the elements correctly into key-value pairs in the resulting map object?

EDITED:

My actual data is an array of two components, both of which are JSON objects:

[{"key1":"value1", "key2":"value2"},{"key3":"value3","key4":"value4","key5":"value5","key6":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}]

Is it possible to create a struct object that contains the two JSON objects as two map objects, so that I can access the first or second struct element and then select a value from the corresponding map object using a key?

For example, assuming my desired end result is called struct_result, I would access value1 from the first component like this:

struct_result.t1["key1"]

which would give me "value1".

Is it possible to achieve this with this lib?

1 Answer

This can be done using named_structs. You need to create a named_struct, and specify the types for each of the keys independently.

For example:

select from_json('{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}',
    named_struct("key1", "", "key2", "", "key3", "",
        "key4", array(array(""))))
from my_table;

This creates a template object using the 'named_struct' UDF; alternatively, you can pass the equivalent string type definition as the second argument.
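As a rough sketch of the string form, the same template could be written as a Hive struct type string, and the parsed fields then read with dot and index notation. This assumes from_json is registered from brickhouse as in the question and that it accepts a struct type string in the same way it accepts the 'map<string,string>' string shown there; the aliases parsed and t are only illustrative names.

```sql
-- Sketch only: the template is given as a type string instead of named_struct.
-- Assumption: from_json accepts 'struct<...>' type strings like it accepts 'map<...>'.
select parsed.key1,         -- a string field of the struct
       parsed.key4[0][2]    -- third item of the first inner array
from (
    select from_json(
        '{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}',
        'struct<key1:string,key2:string,key3:string,key4:array<array<string>>>'
    ) as parsed
    from my_table
    limit 1
) t;
```

Because a struct gives each key its own type, key1 through key3 stay strings while key4 keeps its nested array type, which a single map value type cannot express.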
