I have a Hive table that looks like the following:

id | value
 1 | ['0', '0', '1', '0', '1', '1', '0', '0']
 2 | ['2', '0', '3', '0', '3', '1', '2', '1']

I want the result to be the following:

id | value
 1 | [0,0,1,0,1,1,0,0]
 2 | [2,0,3,0,3,1,2,1]

I need to convert them into an array of floats so that I can use them in ST_Contains(ST_MultiPolygon(), st_point()) to determine whether a point is in an area.

I am new to Hive and not sure whether this is possible. Any help would be much appreciated.

1 Answer

You can explode the array, cast each value, and then collect the values into an array again. Demo:

with your_table as (
  select stack(2,
    1, array('0', '0', '1', '0', '1', '1', '0', '0'),
    2, array('2', '0', '3', '0', '3', '1', '2', '1')
  ) as (id, value)
) -- use your_table instead of this

select s.id,
       s.value                            as original_array,
       collect_list(cast(s.str as float)) as array_float  -- cast each element, then rebuild the array
from (
  -- one row per array element, with its position (pos) and string value (str)
  select t.*, e.*
  from your_table t
       lateral view outer posexplode(t.value) e as pos, str
  distribute by t.id, t.value  -- keep all elements of one array together
  sort by e.pos                -- preserve order in the array
) s
group by s.id, s.value;

Result:

OK
1       ["0","0","1","0","1","1","0","0"]       [0.0,0.0,1.0,0.0,1.0,1.0,0.0,0.0]
2       ["2","0","3","0","3","1","2","1"]       [2.0,0.0,3.0,0.0,3.0,1.0,2.0,1.0]
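
To make the individual steps easier to follow, the explode-and-cast stage can be run on its own, before anything is collected back into an array. This is only a sketch: the single sample row, the alias e, and the column name str_f are illustrative and not part of the original answer.

```sql
-- Sketch only: sample data and the names e / str_f are illustrative.
with your_table as (
  select stack(1,
    1, array('0', '1', '1')
  ) as (id, value)
)
select t.id,
       e.pos,                          -- 0-based index of the element within the array
       e.str,                          -- the element as the original string
       cast(e.str as float) as str_f   -- the same element cast to float
from your_table t
     lateral view outer posexplode(t.value) e as pos, str;
```

Each array element becomes its own row here. The outer query in the demo then groups those rows by id and calls collect_list to rebuild one array per id, relying on the sort by pos for the element order.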

See also this answer about sorting an array within a query: https://stackoverflow.com/a/57392965/2700344

2 Comments

Thanks very much. This is exactly what I want. It took me a long time to understand every step in your code. None of the lines is useless. I appreciate it.
@GuanyanLin You are welcome! If it helps, please accept/vote
