I have data in a Hive table and need help transforming it into the "Expected data" form shown below.
Two things to note:
- Omit the columns whose value is null; for example, omit abs, ada and adw for the first row.
- For the array-type columns (e.g. abs, ada, adw, alt) whose value is not null, include the column name inside each array element (as "SN", as shown in the expected data) and collect the result under a single outer column called EVENTS.
Is there a way to achieve this with Spark SQL, or do I need to write a Scala UDF? I need the solution in Spark with Scala. Any help would be much appreciated.
Hive Table data:
_______________________________________________________________________________________________________________
| vin               | tt | msg_type | abs    | ada                      | adw    | alt                        |
|___________________|____|__________|________|__________________________|________|____________________________|
| FU7XXXXXXXXXXXXXX | 0  | SIGNAL   | (null) | (null)                   | (null) | [{"E":15XXXXXXXX,"V":0.0}] |
|___________________|____|__________|________|__________________________|________|____________________________|
| FSXXXXXXXXXXXXXXX | 0  | SIGNAL   | (null) | [{"E":15XXXXXXXX,"V":1}] | (null) | (null)                     |
|___________________|____|__________|________|__________________________|________|____________________________|
Expected data:
_____________________________________________________________________________
| vin               | tt | msg_type | EVENTS                                |
|___________________|____|__________|_______________________________________|
| FU7XXXXXXXXXXXXXX | 0  | SIGNAL   | [{"SN":"alt","E":15XXXXXXXX,"V":0.0}] |
|___________________|____|__________|_______________________________________|
| FSXXXXXXXXXXXXXXX | 0  | SIGNAL   | [{"SN":"ada","E":15XXXXXXXX,"V":1}]   |
|___________________|____|__________|_______________________________________|
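
One possible direction, as a minimal sketch rather than a confirmed solution: explode each signal column on its own (explode skips rows where the array is null or empty), tag every element with the column name as "SN", union the pieces, and collect them back into one EVENTS array per row. The sketch assumes that every signal column has the same type array<struct<E, V>> (if the columns are actually JSON strings they would need parsing first) and that (vin, tt, msg_type) identifies a source row. The names EventsSketch, keyCols and toEvents are made up for illustration.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, collect_list, explode, lit, struct}

object EventsSketch {
  // Columns carried through unchanged. Assumption: together they identify
  // one source row; otherwise events of different rows would be merged.
  val keyCols: Seq[String] = Seq("vin", "tt", "msg_type")

  // signalCols: names of the array<struct<E, V>> columns, e.g. abs, ada, adw, alt.
  def toEvents(df: DataFrame, signalCols: Seq[String]): DataFrame = {
    require(signalCols.nonEmpty, "at least one signal column is needed")
    val keys = keyCols.map(col)

    // One DataFrame per signal column: one output row per array element,
    // with the column name attached as SN.
    val perSignal = signalCols.map { name =>
      df.select((keys :+ explode(col(name)).as("ev")): _*)
        .select((keys :+ struct(
          lit(name).as("SN"),
          col("ev.E").as("E"),
          col("ev.V").as("V")
        ).as("event")): _*)
    }

    // Stack the per-signal rows and gather them into a single array per key.
    perSignal.reduce(_ union _)
      .groupBy(keys: _*)
      .agg(collect_list(col("event")).as("EVENTS"))
  }
}
```

It would be called as EventsSketch.toEvents(df, Seq("abs", "ada", "adw", "alt")), where df is the DataFrame read from the Hive table. Two caveats of this sketch: a row whose signal columns are all null disappears from the output, and the order of elements inside EVENTS is not guaranteed. If EVENTS is needed as a JSON string rather than an array of structs, to_json could be applied to it afterwards.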