I am using Spark SQL 2.4.1 with Java 8 in my project.

I need to construct a lookup HashMap from a given DataFrame, as below:

List ll = Arrays.asList(
      ("aaaa", 11),
      ("aaa", 12),
      ("aa", 13),
      ("a", 14)
    )

Dataset<Row> codeValudeDf = ll.toDF( "code", "value")

Given the above dataframe I need to create a hashmap

i.e.

Map<String, String> lookUpHm = new HashMap<>();

lookUpHm  => aaaa->11  , aaa->12 , aa->13, a->14

How can this be done in Java?

2 Answers

Try this:

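        import java.util.Arrays;
        import java.util.HashMap;
        import java.util.List;
        import java.util.Map;
        import java.util.stream.Collectors;
        import org.apache.spark.sql.Dataset;
        import org.apache.spark.sql.Row;
        import org.apache.spark.sql.RowFactory;
        import org.apache.spark.sql.types.DataTypes;
        import org.apache.spark.sql.types.Metadata;
        import org.apache.spark.sql.types.StructType;

        // "spark" is an existing SparkSession.
        List<Row> rows = Arrays.asList(
                RowFactory.create("aaaa", 11),
                RowFactory.create("aaa", 12),
                RowFactory.create("aa", 13),
                RowFactory.create("a", 14)
        );

        Dataset<Row> codeValudeDf = spark.createDataFrame(rows, new StructType()
                .add("code", DataTypes.StringType, true, Metadata.empty())
                .add("value", DataTypes.IntegerType, true, Metadata.empty()));

        // Collect the rows to the driver and copy each (code, value) pair into the map.
        Map<String, Integer> map = new HashMap<>();
        codeValudeDf.collectAsList().forEach(row -> map.put(row.getString(0), row.getInt(1)));

        System.out.println(map.entrySet().stream().map(e -> e.getKey() + "->" + e.getValue())
                .collect(Collectors.joining(", ", "[ ", " ]")));
        // [ aaa->12, aa->13, a->14, aaaa->11 ]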
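
The question asks for a Map<String, String>, while the snippet above produces a Map<String, Integer>. A minimal sketch of the same collect-then-copy step with the values converted to strings follows. It is an illustration only: the class name LookupMapSketch and the method toLookup are hypothetical, and each Object[] stands in for a collected Row (index 0 is the code, index 1 is the value) so that the example runs without a Spark session.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class LookupMapSketch {

    // Builds a String -> String lookup map from (code, value) pairs.
    // Each Object[] is a stand-in for a collected Row: [0] = code, [1] = value.
    public static Map<String, String> toLookup(List<Object[]> pairs) {
        return pairs.stream().collect(Collectors.toMap(
                p -> (String) p[0],
                p -> String.valueOf(p[1]),
                (first, second) -> second, // if a code repeats, keep the last value seen
                HashMap::new));
    }

    public static void main(String[] args) {
        List<Object[]> pairs = Arrays.asList(
                new Object[]{"aaaa", 11},
                new Object[]{"aaa", 12},
                new Object[]{"aa", 13},
                new Object[]{"a", 14});

        Map<String, String> lookUpHm = toLookup(pairs);
        System.out.println(lookUpHm.get("aaaa")); // prints 11
    }
}
```

The merge function is there because Collectors.toMap throws an IllegalStateException on duplicate keys when no merge function is given; whether the last value should win is an assumption to adjust to the data.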
1 Comment

If the dataset is huge, codeValudeDf.collectAsList() will collect all the data on the driver, which would be a problem. We want the work to be distributed, with the map prepared and collected from the distributed dataset. How can this be done?
Simply add a new column of map type using withColumn, then call collect on your DataFrame.

codeValudeDf.withColumn("some_map",
map(col("code"), col("value"))).select("some_map").distinct().collect()

1 Comment

I am getting this error: org.apache.spark.sql.AnalysisException: Cannot have map type columns in DataFrame which calls set operations(intersect, except, etc.), but the type of column some_map is map<string,int>;;
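
The exception points at distinct(), which Spark treats as a set operation and rejects on map-typed columns. A rough sketch of one way around it, assuming distinct() is dropped and each collected row is turned into a one-entry java.util.Map (for instance with Row.getJavaMap(0)): the per-row maps are merged on the driver, and repeated entries collapse because the keys are equal. The list below is a stand-in for those per-row maps so the example runs without Spark, and MergeMapsSketch and merge are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MergeMapsSketch {

    // Merges many small maps into one; later entries overwrite earlier ones
    // with the same key, so duplicate rows do not need a prior distinct().
    public static <K, V> Map<K, V> merge(List<Map<K, V>> perRowMaps) {
        Map<K, V> merged = new HashMap<>();
        for (Map<K, V> m : perRowMaps) {
            merged.putAll(m);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Stand-in for the collected "some_map" column: one single-entry map per row.
        List<Map<String, Integer>> perRowMaps = Arrays.asList(
                Collections.singletonMap("aaaa", 11),
                Collections.singletonMap("aaa", 12),
                Collections.singletonMap("aa", 13),
                Collections.singletonMap("a", 14),
                Collections.singletonMap("a", 14)); // duplicate row

        Map<String, Integer> lookup = merge(perRowMaps);
        System.out.println(lookup.size());      // prints 4
        System.out.println(lookup.get("aaaa")); // prints 11
    }
}
```

This still gathers every row on the driver, so it does not address the concern about very large datasets raised in the earlier comment.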
