I have a Hive table with the following properties:
- ORC Storage Format
- transactional = true
- Partitioned on 4 keys: year, month, day, hour
- Bucketed by groupingKey
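For reference, a table with these properties could be declared roughly as follows. This is only a minimal sketch: the column types, the `payload` column, and the bucket count of 8 are placeholders I am assuming for illustration, not the actual schema.

```sql
-- Hypothetical DDL matching the properties listed above.
-- Column types, the payload column, and the bucket count are assumptions.
CREATE TABLE table_name (
  groupingKey STRING,   -- bucketing column (type assumed)
  payload     STRING    -- placeholder for the remaining columns
)
PARTITIONED BY (`year` INT, `month` INT, `day` INT, `hour` INT)
CLUSTERED BY (groupingKey) INTO 8 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional' = 'true');
```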
I am using Hive Streaming to populate data directly into the table.
The problem occurs when I run the following query:
select count(*) from table_name;
I get the following exception:
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
at org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.setVector(VectorizedBatchUtil.java:295)
at org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.acidAddRowToBatch(VectorizedBatchUtil.java:275)
at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowReader.next(VectorizedOrcAcidRowReader.java:82)
However, if I turn off vectorized execution by setting the following property:
set hive.vectorized.execution.enabled = false;
everything works fine (although it takes ages to complete).
Why is this happening? From what I understand, vectorized execution should work with the ORC format.
Hadoop version: 2.7.1
Hive version: 1.2.1
