
I am following the code from Pascal Bugnion's book Scala for Data Science. First, a case class to represent a transaction:

case class Transaction(
  id: Option[Int],                       // unique identifier
  candidate: String,                     // candidate receiving the donation
  contributor: String,                   // name of the contributor
  contributorState: String,              // contributor state
  contributorOccupation: Option[String], // contributor job
  amount: Long,                          // amount in cents
  date: Date                             // date of the donation
)

defined class Transaction

Then I loaded the data with the help of the FECData singleton object:

scala> val ohioData = FECData.loadOhio
ohioData: FECData = FECData@7e83a375

The FECData object has a transactions attribute:

scala> val ohioTransactions = ohioData.transactions
ohioTransactions: Iterator[Transaction] = non-empty iterator

When I try to print the first 5 transactions, I get an exception:

scala> ohioTransactions.take(5).foreach(println)
java.text.ParseException: Unparseable date: "06-DEC-11"
  at java.text.DateFormat.parse(DateFormat.java:366)
  at FECData$$anonfun$1.apply(FECData.scala:26)
  at FECData$$anonfun$1.apply(FECData.scala:16)
  at scala.collection.Iterator$$anon$11.next(Iterator.scala:370)

Here are the header and the first 5 rows of the CSV file:

candidate_id,candidate,contributor_name,contributor_state,contributor_occupation,amount,date
P80000748,"Paul, Ron","BROWN, TODD W MR.",OH,ENGINEER,50.0,06-DEC-11
P80000748,"Paul, Ron","DIEHL, MARGO SONJA",OH,RETIRED,25.0,06-DEC-11
P80000748,"Paul, Ron","KIRCHMEYER, BENJAMIN",OH,COMPUTER PROGRAMMER,201.2,06-DEC-11
P80003338,"Obama, Barack","KEYES, STEPHEN",OH,HR EXECUTIVE / ATTORNEY,100.0,30-SEP-11
P80003338,"Obama, Barack","MURPHY, MIKE W",OH,MANAGER,50.0,26-SEP-11

Why can't the date be parsed?

  • Unparseable date most likely means that your date format is not recognized by the formatter. Commented Feb 21, 2017 at 17:02
  • Have a look at the exception and you'll see 2 things: DateFormat.parse(...) is throwing an exception and the message says Unparseable date: "06-DEC-11". This indicates the date doesn't match the date format being used so check which is used and either adjust the format or the date. Commented Feb 21, 2017 at 17:03
  • @Thomas Take a look at my edit,first lines of the file are shown! Commented Feb 21, 2017 at 17:08
  • Just a guess, your formatter expects Dec with only the first letter in uppercase? Commented Feb 21, 2017 at 17:13
  • Well what should that tell me? There's the date that can't be parsed but what should I look at it for? You'll need to know which format is expected and adjust accordingly. Note that if your formatter actually expects the month by short name (i.e. DEC) the Locale matters as well. That means if your formatter uses a locale other than English chances are high that the short names differ (e.g. DEZ in German) and thus the name is not recognized. Commented Feb 21, 2017 at 17:14

1 Answer

OK, the problem is that FECData defines its dateParser as new SimpleDateFormat("DD-MMM-YY").

According to the documentation (https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html#SimpleDateFormat(java.lang.String)), this constructor builds a SimpleDateFormat using the given pattern and the default date format symbols for the default locale.

Your JVM's default locale is evidently not Locale.ENGLISH, so the DEC part of "06-DEC-11" is not recognized as a month name and parsing fails.
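To check this on your machine, you can print the default locale and the short month names it provides. The following is only a minimal sketch using standard JDK classes; the LocaleCheck object and its shortMonths helper are hypothetical names and not part of the book's code.

```scala
import java.text.DateFormatSymbols
import java.util.Locale

// Hypothetical helper for diagnosing the problem; not part of FECData.
object LocaleCheck {
  // Short month names for a locale. getShortMonths returns 13 entries,
  // the last of which is empty for the Gregorian calendar, so drop blanks.
  def shortMonths(locale: Locale): Seq[String] =
    new DateFormatSymbols(locale).getShortMonths.filter(_.nonEmpty).toSeq

  def main(args: Array[String]): Unit = {
    // This is the locale new SimpleDateFormat(pattern) uses when none is given.
    val default = Locale.getDefault
    println(s"Default locale: $default")
    println(s"Default short months: ${shortMonths(default).mkString(", ")}")
    println(s"English short months: ${shortMonths(Locale.ENGLISH).mkString(", ")}")
  }
}
```

If the default list has no entry that matches DEC (ignoring case), the parser cannot recognize the month in the CSV file.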

You just need to patch FECData: replace private val dateParser = new SimpleDateFormat("DD-MMM-YY") with private val dateParser = new SimpleDateFormat("DD-MMM-YY", java.util.Locale.ENGLISH).
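Here is a small standalone sketch of locale-explicit parsing that you can try outside FECData. The DateParsing object and parseDate function are hypothetical names for illustration. One deliberate difference from the book's code: the sketch uses the pattern dd-MMM-yy, because in SimpleDateFormat patterns D means day in year and Y means week year, whereas d and y are day in month and year, which is what a value like 06-DEC-11 appears to contain. Treat that change as my assumption rather than the book's fix.

```scala
import java.text.SimpleDateFormat
import java.util.{Date, Locale}
import scala.util.Try

// Hypothetical standalone example; FECData itself is not modified here.
object DateParsing {
  // Parse a date such as "06-DEC-11" with an explicit locale.
  // Assumption: "dd-MMM-yy" (day of month, short month name, two-digit year).
  // A new SimpleDateFormat is built per call because the class is not thread-safe.
  def parseDate(raw: String, locale: Locale): Try[Date] =
    Try(new SimpleDateFormat("dd-MMM-yy", locale).parse(raw))

  def main(args: Array[String]): Unit = {
    // English month names include Dec, and name matching ignores case.
    println(parseDate("06-DEC-11", Locale.ENGLISH))
    // German short month names do not include DEC, so this one is expected to fail.
    println(parseDate("06-DEC-11", Locale.GERMAN))
  }
}
```

Passing the locale explicitly makes the result independent of whatever default locale the JVM happens to run with.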

Reference for Locale: https://docs.oracle.com/javase/7/docs/api/java/util/Locale.html
