I'm trying to convert a String[][] into a Dataset<Row> with a single column whose values are String[].
I have gone through the docs and the examples available online but could not find anything similar. I don't know whether it's possible, as I'm a complete beginner in Spark.
Sample input:
String[][] test = {{"test1"}, {"test2", "test3"}, {"test4", "test5"}};
Sample output:
Dataset<Row> test_df
test_df.show()
+-------------+
|          foo|
+-------------+
|      [test1]|
|[test2,test3]|
|[test4,test5]|
+-------------+
I'm probably defining the StructType wrong for the String[][] input; I've tried several variations as well. Here's my current attempt:
String[][] test = {{"test1"}, {"test2", "test3"}, {"test4", "test5"}};
List<String[]> test1 = Arrays.asList(test);
StructType structType = DataTypes.createStructType(
        DataTypes.createStructField(
                "foo",
                DataTypes.createArrayType(DataTypes.StringType),
                true));
Dataset<Row> t = spark.createDataFrame(test1, structType);
t.show();
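For reference, here is a minimal sketch of one way the attempt above might be made to work. It rests on two points: DataTypes.createStructType takes an array or List of StructField rather than a single field, and the createDataFrame overload that accepts a StructType takes a List<Row> rather than a List<String[]>. The sketch assumes Spark SQL 2.x or later on the classpath and a local master. The class name ArrayColumnSketch and the helpers schema() and toRows() are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

// Hypothetical class name, for illustration only.
public class ArrayColumnSketch {

    // Schema with one nullable column "foo" of type array<string>.
    static StructType schema() {
        StructField foo = DataTypes.createStructField(
                "foo", DataTypes.createArrayType(DataTypes.StringType), true);
        // createStructType expects an array (or List) of fields, not a single field.
        return DataTypes.createStructType(new StructField[] { foo });
    }

    // Wrap each inner String[] in a Row that holds exactly one value.
    static List<Row> toRows(String[][] data) {
        List<Row> rows = new ArrayList<>();
        for (String[] inner : data) {
            // The (Object) cast keeps the array as a single varargs argument;
            // without it, each element would become its own column.
            rows.add(RowFactory.create((Object) inner));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Assumption: a local master is acceptable for trying this out.
        SparkSession spark = SparkSession.builder()
                .appName("array-column-sketch")
                .master("local[*]")
                .getOrCreate();

        String[][] test = {{"test1"}, {"test2", "test3"}, {"test4", "test5"}};
        Dataset<Row> df = spark.createDataFrame(toRows(test), schema());
        df.printSchema();
        df.show();

        spark.stop();
    }
}
```

The cast in toRows() is the part that is easiest to miss: RowFactory.create is declared with varargs (Object... values), so passing a String[] directly is treated as a list of separate column values and no longer matches the one-column schema.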