I'd like to convert my org.apache.spark.mllib.linalg.Matrix to an org.apache.spark.mllib.linalg.distributed.RowMatrix.

I can do it like this:

val xx = X.computeGramianMatrix()  //xx is type org.apache.spark.mllib.linalg.Matrix
val xxs = xx.toString()
val xxr = xxs.split("\n").map(row => row.replace("   "," ").replace("  "," ").replace("  "," ").replace("  "," ").replace(" ",",").split(","))
val xxp = sc.parallelize(xxr)
val xxd = xxp.map(ar => Vectors.dense(ar.map(elm => elm.toDouble)))
val xxrm: RowMatrix = new RowMatrix(xxd)

However, that is really gross and a total hack: it round-trips the values through the matrix's string representation. Can someone show me a better way?

Note: I am using Spark version 1.3.0.

2 Answers

I suggest converting your Matrix to an RDD[Vector], which you can then pass directly to the RowMatrix constructor.

Consider the following example:

import org.apache.spark.rdd._
import org.apache.spark.mllib.linalg._


val denseData = Seq(
  Vectors.dense(0.0, 1.0, 2.0),
  Vectors.dense(3.0, 4.0, 5.0),
  Vectors.dense(6.0, 7.0, 8.0),
  Vectors.dense(9.0, 0.0, 1.0)
)

val dm: Matrix = Matrices.dense(3, 2, Array(1.0, 3.0, 5.0, 2.0, 4.0, 6.0))

We will need to define a method to convert that Matrix into an RDD[Vector]:

def matrixToRDD(m: Matrix): RDD[Vector] = {
   val columns = m.toArray.grouped(m.numRows)
   val rows = columns.toSeq.transpose // Skip this if you want a column-major RDD.
   val vectors = rows.map(row => new DenseVector(row.toArray))
   sc.parallelize(vectors)
}
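
To see what the grouping and transpose steps do, here is a minimal sketch in plain Scala (no Spark needed). It assumes the values are stored in column-major order, which is how Matrices.dense lays them out; the names numRows, values, columns and localRows are only for illustration.

```scala
// Column-major values of the 3 x 2 matrix `dm` from the example above.
val numRows = 3
val values = Array(1.0, 3.0, 5.0, 2.0, 4.0, 6.0)

// Each run of `numRows` consecutive values is one column of the matrix.
val columns: Seq[Seq[Double]] = values.grouped(numRows).map(_.toSeq).toSeq

// Transposing the sequence of columns yields the sequence of rows.
val localRows: Seq[Seq[Double]] = columns.transpose

// localRows == Seq(Seq(1.0, 2.0), Seq(3.0, 4.0), Seq(5.0, 6.0))
localRows.foreach(row => println(row.mkString(" ")))
```

Each inner sequence is one row of the matrix, which is what ends up as a single Vector in the RDD.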

Now we can apply that conversion to the main Matrix:

import org.apache.spark.mllib.linalg.distributed.RowMatrix
val rows = matrixToRDD(dm)
val mat = new RowMatrix(rows)
A small correction to the code above: we need to use Vectors.dense instead of new DenseVector.

val vectors = rows.map(row => Vectors.dense(row.toArray))

2 Comments

Is there a specific reason to use this over new DenseVector?
I'm not sure what this is about. What is the justification for it? Why would you need that?
