I'm new to Scala/Spark and am trying to loop through the columns of a DataFrame and keep the results as the loop progresses. The following code works, but it can only print the results to the screen.

traincategory.columns.foreach { x =>
  val test1 = traincategory.select("Id", x)
  import org.apache.spark.ml.feature.{OneHotEncoder, StringIndexer}
  // CODE TO PERFORM ONEHOT TRANSFORMATION
  val encoded = encoder.transform(indexed)
  encoded.show()
}

Because a val is immutable, I tried appending the vectors from this transformation to another variable, as one might do in R:

//var ended = traincategory.withColumn(x,encoded(0))

I suspect Scala has a more idiomatic way of processing this.

Thank you in advance for your help.

1 Answer

A solution is available at:

https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/mllib/Correlations.scala

If anyone has similar issues with Scala MLlib, there is good example code at:

https://github.com/apache/spark/tree/master/examples/src/main/scala/org/apache/spark/examples/mllib
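
For the loop in the question specifically, one way to avoid reassigning a variable inside foreach is to build the indexing and encoding stages from the column names and run them together as a single Pipeline. The following is only a minimal sketch, not taken from the linked examples. It assumes every column other than "Id" is a string or numeric category column. The helper name encodeAll and the "_idx" / "_vec" output column suffixes are hypothetical names chosen for illustration.

```scala
import org.apache.spark.ml.{Pipeline, PipelineStage}
import org.apache.spark.ml.feature.{OneHotEncoder, StringIndexer}
import org.apache.spark.sql.DataFrame

// Hypothetical helper: index and one-hot encode every column except idCol.
// The result keeps all original columns and adds "<col>_idx" and "<col>_vec".
def encodeAll(df: DataFrame, idCol: String = "Id"): DataFrame = {
  // Build two stages per category column instead of mutating a var in a loop.
  val stages: Array[PipelineStage] = df.columns.filter(_ != idCol).flatMap { c =>
    val indexer = new StringIndexer()
      .setInputCol(c)
      .setOutputCol(s"${c}_idx")
    val encoder = new OneHotEncoder()
      .setInputCol(s"${c}_idx")
      .setOutputCol(s"${c}_vec")
    Seq[PipelineStage](indexer, encoder)
  }
  // fit learns the string-to-index mappings; transform appends the new columns.
  new Pipeline().setStages(stages).fit(df).transform(df)
}
```

With the question's DataFrame this could be called as `val ended = encodeAll(traincategory)`, followed by `ended.show()`. Wrapping the stages in a Pipeline is a design choice that sidesteps the difference between Spark versions in which OneHotEncoder is a plain transformer and those in which it must be fitted first, since both kinds are accepted as pipeline stages.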
