The column contains multiple occurrences of the delimiter in a single row, so a plain split is not straightforward: the string should be split only at the first delimiter occurrence.
This is what I am doing now, but I suspect there is a cleaner solution:
from pyspark.sql.functions import col, expr, split

testdf = spark.createDataFrame(
    [("Dog", "meat,bread,milk"), ("Cat", "mouse,fish")],
    ["Animal", "Food"],
)
testdf.show()
+------+---------------+
|Animal| Food|
+------+---------------+
| Dog|meat,bread,milk|
| Cat| mouse,fish|
+------+---------------+
# Take the text before the first comma as Food1, delete that prefix from Food
# (note: regexp_replace interprets Food1 as a regex pattern), then drop the
# leading comma left behind.
testdf.withColumn("Food1", split(col("Food"), ",").getItem(0)) \
    .withColumn("Food2", expr("regexp_replace(Food, Food1, '')")) \
    .withColumn("Food2", expr("substring(Food2, 2)")) \
    .show()
+------+---------------+-----+----------+
|Animal| Food|Food1| Food2|
+------+---------------+-----+----------+
| Dog|meat,bread,milk| meat|bread,milk|
| Cat| mouse,fish|mouse| fish|
+------+---------------+-----+----------+
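For clarity, the semantics being asked for match a one-split in plain Python; a minimal sketch of that behavior (the helper name `split_first` is mine, not from any library):

```python
# Target semantics: split only at the FIRST delimiter occurrence.
# str.partition also handles rows with no delimiter at all.
def split_first(food: str, sep: str = ","):
    head, _, tail = food.partition(sep)
    return head, tail

print(split_first("meat,bread,milk"))  # → ('meat', 'bread,milk')
print(split_first("mouse,fish"))       # → ('mouse', 'fish')
```

On Spark 3.0 and later, `split` itself accepts a `limit` argument, so the same effect should be achievable in one pass without `regexp_replace`, e.g. `split(col("Food"), ",", 2).getItem(0)` and `.getItem(1)`.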