Lately I've been learning about Spark SQL, and I'd like to know whether there is any way to use MLlib from within Spark SQL, like this:

select mllib_methodname(some column) from tablename; 

Here, "mllib_methodname" is an MLlib method. Is there an example that shows how to use MLlib methods in Spark SQL?

Thanks in advance.

  • Currently I don't think so. SQL is mainly meant for data warehousing and preprocessing the data. You can certainly build the dataset using SQL and then run it through MLlib, but I couldn't find a way to do it the other way around. Commented Jun 25, 2015 at 14:50
  • I think I can define a custom function in SQL that calls a method in MLlib. Commented Jun 26, 2015 at 10:42
  • That would be great. You may want to check the Spark bug list; if it's not there, you could contribute it. Commented Jun 26, 2015 at 11:08

1 Answer

The new pipeline API is based on DataFrames, which are backed by Spark SQL. See

http://spark.apache.org/docs/latest/ml-guide.html
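
A rough sketch of what that can look like, in Scala. It assumes Spark 2.x or later (`SparkSession`, `createOrReplaceTempView`); the table names `training` and `scored`, the columns `x1`, `x2`, `label`, and the toy data are all made up for illustration. The idea is that SQL produces the input DataFrame, the pipeline adds a `prediction` column, and the result is queried with SQL again.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.SparkSession

object PipelineSqlSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("pipeline-sql-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical toy table: two numeric features and a binary label.
    Seq((0.0, 0.1, 0.0), (0.2, 0.0, 0.0), (1.0, 0.9, 1.0), (0.9, 1.1, 1.0))
      .toDF("x1", "x2", "label")
      .createOrReplaceTempView("training")

    // Assemble the numeric columns into the "features" vector column,
    // then fit a logistic regression on it.
    val assembler = new VectorAssembler()
      .setInputCols(Array("x1", "x2"))
      .setOutputCol("features")
    val lr = new LogisticRegression().setMaxIter(10)
    val pipeline = new Pipeline().setStages(Array(assembler, lr))

    // The training DataFrame comes straight from a SQL query.
    val model = pipeline.fit(spark.sql("SELECT x1, x2, label FROM training"))

    // Score rows selected with SQL and expose the result as a table again.
    model.transform(spark.sql("SELECT x1, x2 FROM training"))
      .createOrReplaceTempView("scored")
    spark.sql("SELECT x1, x2, prediction FROM scored").show()

    spark.stop()
  }
}
```

Note that the model is applied through `transform` on a DataFrame here, not called as a function inside the `SELECT` itself.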

Alternatively, you can register the predict method of an MLlib model as a UDF and use it in your SQL statements. See

http://spark.apache.org/docs/latest/sql-programming-guide.html#udf-registration-moved-to-sqlcontextudf-java--scala
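
Something along these lines should work for the UDF route. This is only a sketch in Scala, assuming Spark 2.x or later; on 1.x the equivalent calls would be `sqlContext.udf.register` and `registerTempTable`. The UDF name `predict_label`, the table `points`, the columns `x1` and `x2`, and the toy training data are hypothetical.

```scala
import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.sql.SparkSession

object MllibUdfSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("mllib-udf-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical toy training set for an RDD-based MLlib model.
    val training = spark.sparkContext.parallelize(Seq(
      LabeledPoint(0.0, Vectors.dense(0.0, 0.1)),
      LabeledPoint(0.0, Vectors.dense(0.2, 0.0)),
      LabeledPoint(1.0, Vectors.dense(1.0, 0.9)),
      LabeledPoint(1.0, Vectors.dense(0.9, 1.1))
    ))
    val model = new LogisticRegressionWithLBFGS().setNumClasses(2).run(training)

    // Wrap model.predict in a UDF. The closure captures the (serializable)
    // model, so Spark ships it to the executors along with the function.
    spark.udf.register("predict_label",
      (x1: Double, x2: Double) => model.predict(Vectors.dense(x1, x2)))

    // Hypothetical table to score.
    Seq((0.1, 0.0), (1.1, 0.9)).toDF("x1", "x2").createOrReplaceTempView("points")

    // The MLlib prediction is now callable from plain SQL.
    spark.sql("SELECT x1, x2, predict_label(x1, x2) AS label FROM points").show()

    spark.stop()
  }
}
```

The UDF takes plain `Double` columns and builds the feature vector itself, which keeps the SQL side free of vector types.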
