0

I have a CSV file with two columns, like this:

column1                             column2
sachin@@@tendulkar@@@Ganguly       cricket@@@India@@@players

I want to convert it to a hash map that looks like this:

sachin-> "cricket, India, players"
tendulkar-> "cricket, India, players"
Ganguly-> "cricket, India, players"

Here "cricket, India, players" should be a single string. How can I do this in Scala? This is what I have so far:

val csv = sc.textFile("/tag/players.csv")  
val headerAndRows = csv.map(line => line.split(",").map(_.trim))
val header = headerAndRows.first()  
val synonyms = csv.map(_.split(",")).map(p => p(1))   // for column1
val targettag = csv.map(_.split(",")).map(p => p(2))  // for column2
val splitsyno = synonyms.map(x => x.split("@@@"))
val splittarget = targettag.map(x=>x.split("@@@"))

How do I proceed from here to create the desired hash map?

3
  • What problem do you have? Are you unable to read the file? Unable to parse it as CSV? Unable to split the strings? Unable to combine them with the separator you want? Unable to put them in a Map? Commented Oct 28, 2016 at 17:01
  • It is comma separated. val targettag = csv.map(_.split(",")).map(p=>p(2)) val splitsyno = targettag.map(x => x.split("@@@")) I have split the strings but am not able to create the desired hash map. Commented Oct 28, 2016 at 17:04
  • Put your code in your question and specify what part is causing the problem. Commented Oct 28, 2016 at 17:07

3 Answers

1

This code works for a single line; afterwards you can merge the results for all lines if you want to. I've hardcoded the provided row.

First it splits the row into a tuple of the two columns. Step 2 replaces the '@@@' separators in column2 with commas. Step 3 splits column1 at '@@@', pairs each piece with the column2 string, and converts the resulting pairs to a Map.

The solution can probably be optimized further.

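val data = "sachin@@@tendulkar@@@Ganguly, cricket@@@India@@@players"

// Split the row into its two columns, trimming the space left after the comma
val (c1: String, c2: String) = data.split(",").map(_.trim) match {
  case Array(a, b) => (a, b)
}
// Turn "cricket@@@India@@@players" into "cricket, India, players"
val c2s = c2.replace("@@@", ", ")
// Pair every name in column1 with the same column2 string
val xx = c1.split("@@@").map(_ -> c2s).toMap

// Just to validate the output
xx.foreach(f => println(f._1 + "->" + f._2))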
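
To merge all lines, as mentioned above, one option is to wrap the single-line logic in a function and fold the per-line maps together. The following is only a sketch, assuming every usable line holds exactly two comma-separated fields; the names MergeLines, lineToMap and mergeLines are hypothetical and not part of the original code.

```scala
object MergeLines {
  // Turn one "keys, values" line into a Map.
  // Lines that do not split into exactly two fields yield an empty Map (an assumption).
  def lineToMap(line: String): Map[String, String] =
    line.split(",").map(_.trim) match {
      case Array(keys, values) =>
        val joined = values.split("@@@").mkString(", ")
        keys.split("@@@").map(_ -> joined).toMap
      case _ => Map.empty[String, String]
    }

  // Combine the per-line maps; on duplicate keys the later line wins.
  def mergeLines(lines: Seq[String]): Map[String, String] =
    lines.map(lineToMap).foldLeft(Map.empty[String, String])(_ ++ _)
}
```

For the hardcoded row, MergeLines.lineToMap(data) should give the same three entries as xx. With an RDD of lines, the same function could probably be passed to flatMap and the pairs gathered with collectAsMap(), though that is untested here.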

11 Comments

Hey, I have to read this CSV into a Spark RDD first, for which I did: val data = csv.map(_.split(",")).map( p=>p(1)+"###"+p(2)). I used ### because there are commas inside the column entries themselves. Now when I do val (c1:String, c2:String) = data.take(totalcountof1).map(x=>x.split("###")) match {case Array(a, b) => (a,b)}) this throws an error: found : String required: Array[String]
Look at your parentheses. You are applying the match to the result of take(). You should apply it to x.split("###") match {...})
Thanks for your response. But now I am getting a new error: found : (T1, T2) required: Array[(String, String)]
val (c1:String, c2:String) = data.take(totalcountof1).map(x=>x.split("###") match {case Array(a, b) => (a,b)}) this should work....
With exactly the above command, the error is: error: constructor cannot be instantiated to expected type; found : (T1, T2) required: Array[(String, String)]
1

I would do the following (note that I'm skipping the step of mapping a CSV row to extract column1 and column2):

val synonyms = List("sachin", "tendulkar", "Ganguly")
val target = "cricket@@@India@@@players".replaceAll("@@@", ", ")
for(s <- synonyms) yield s -> target


0

This code works, but it is not very functional in style:

import scala.collection.mutable.Map

object parseData {
  val inputSource = io.Source.fromFile("<input-file>")
  val dataMap = Map.empty[String, String]
  for (line <- inputSource.getLines.drop(1)) {
    val keysAndValues = line.split(" +")
    val keys = keysAndValues(0).split("@@@")
    val values = keysAndValues(1).split("@@@").mkString(", ")
    for (k <- keys) {
      dataMap += (k -> values)
    }
  }
  inputSource.close
}

We can probably make several things more idiomatic, but this is a good enough start. The drop(1) skips the header row so that column1 and column2 do not end up in the dataMap.

Output:

Map(tendulkar -> cricket, India, players, Ganguly -> cricket, India, players, sachin -> cricket, India, players)
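
As one possible way to make this more idiomatic, the mutable Map and the nested loops can be replaced with a chain of collection operations that produces an immutable Map. This is a rough sketch under the same assumptions as the code above (a header row, then two columns separated by one or more spaces); ParseDataFunctional and parse are hypothetical names.

```scala
object ParseDataFunctional {
  // Build an immutable Map from whitespace-separated lines, skipping the header row.
  def parse(lines: Iterator[String]): Map[String, String] =
    lines
      .drop(1)                                  // skip the "column1 column2" header
      .map(_.trim.split(" +"))                  // split each line into its columns
      .collect { case Array(keys, values) =>    // ignore blank or malformed lines
        (keys.split("@@@"), values.split("@@@").mkString(", "))
      }
      .flatMap { case (keys, joined) => keys.iterator.map(_ -> joined) }
      .toMap
}
```

To use it with a file, the iterator from scala.io.Source.fromFile("<input-file>").getLines() can be passed to parse, with the source closed afterwards as in the code above.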

