2

After I load a json file with:

df = sqlContext.read().json(path);

I will get my DataFrame in Java Spark. I have for example the next DF:

id item1 item2 item3 ....
id1    0     3     4
id2    1     0     2
id3    3     3     0
...

I want to transform it in the most easy way to (probably of Object of the class Rating, id and item then to Integer by .hashCode())

id   item   ranking
id1  item1    0
id1  item2    3
id1  item3    4
....
id2  item1    1
id2  item2    0
id1  item1    2
...

PS Some first attempt to create the flatMap function:

void transformTracks() {
        JavaRDD<Rating> = df.flatMap(new Function<Row, Rating>(){
            public Rating call(Row r) {
                for (String i : r) {
                    return Rating(1, 1, r.apply(Double.parseDouble(i)));
                }
            }
        })
    }
2
  • 1
    I'm thinking flatMap will do the trick? Commented Feb 24, 2016 at 14:23
  • @Glennie Helles Sindhoit, sorry, I'm new in Java Spark, can you please show it on example? Commented Feb 24, 2016 at 14:28

1 Answer 1

2

You have to forgive me if the syntax is slightly off - I program in Scala nowadays and it's been a while since I used Java - but something along the lines of:

DataFrame df = sqlContext.read().json(path);
String[] columnNames = df.columns;

DataFrame newDF = df.flatMap(row -> {
  ArrayList list = new ArrayList<>(columnNames.length);
  String id = (String)row.get(0);

  for (int i = 1; i < columnNames.length, i++) {
    list.add(id, columnNames[i], (int)row.get(i));
  }
  return list;
}).toDF("id", "item", "ranking");
Sign up to request clarification or add additional context in comments.

1 Comment

Thank you, I try to understand the flat map function right now, will update my solution above.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.