7

I have a question for Apache common CSVParser/CSVRecord. Take a look at the CSV file below:

Header1,Header2,Header3
"",,"L1C3"

CSVParser/CSVRecord is returning "" for the first two columns. In my case I want to distinguish empty string("") and null values. Is there a configuration I could set to let CSVParser to return null for the second column?

7
  • Possible duplicate of How to differentiate between empty string and null with OpenCSV Commented Jan 12, 2016 at 1:35
  • I'm not sure whether this is a dupe, though I saw the other article as well. Commented Jan 12, 2016 at 1:36
  • @TimBiegeleisen It is. You can see the exact same question. Commented Jan 12, 2016 at 1:36
  • Same question, different library, or not? Commented Jan 12, 2016 at 1:37
  • 1
    One hack is to pre-process your file and replace your ""s with something else. Then you can use withNullString on CVSFormat Commented Jan 12, 2016 at 1:41

4 Answers 4

9

I've used this format :

CSVFormat.RFC4180.withFirstRecordAsHeader()
   .withIgnoreSurroundingSpaces()
   .withNullString("")

Where the 2 configurations:

  1. ignore space - which trims any value on both sides, if its all space, it will be trimmed to empty space
  2. null string - which treats the empty spaces as null

Here's a sample usage:

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertNull;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import org.junit.Test;

public class CsvParseTest {

    @Test
    public void testParseWillTrimAndConvertToNull() throws Exception {
        String CSV_HEADER = "Name,MobileNo,Location";
        String CSV_ROW_1 = "abc,   ,australia"; // MobileNo is 3 whitespaces
        CSVParser parse = CSVFormat.RFC4180.withFirstRecordAsHeader().withIgnoreSurroundingSpaces().withNullString("")
                .parse(new BufferedReader(new StringReader(CSV_HEADER + "\n" + CSV_ROW_1)));

        CsvRecord rec = parse.getRecords().get(0);
        assertEquals("abc", rec.get("Name"));
        assertNull(rec.get("MobileNo"));
        assertEquals("australia", rec.get("Location"));
    }
}
Sign up to request clarification or add additional context in comments.

Comments

2

I think uniVocity-parsers is the only library that allows you to distinguish empty strings from nulls (I know this won't address your problem with Apache Commons CSV directly, but at least there's a way to get what you need).

Here's how to do it:

public static void main(String ... args){
    String input = "Header1,Header2,Header3\n" +
            "\"\",,\"L1C3\"";

    CsvParserSettings settings = new CsvParserSettings(); //many options here, check the tutorial.
    settings.setEmptyValue("I'm empty"); //value to use when the parser finds "". Set to "" to get an empty String.
    settings.setNullValue("I'm null"); //value to use when the parser finds a null value (i.e. ,,). 

    CsvParser parser = new CsvParser(settings);
    List<String[]> allRows = parser.parseAll(new StringReader(input));

    for(String[] row : allRows){
        System.out.println(Arrays.toString(row));
    }
}

This will produce the following output:

[Header1, Header2, Header3]
[I'm empty, I'm null, L1C3]

uniVocity-parsers is also 3 times faster than Apache Commons CSV and has way more features.

Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).

Comments

1

At the end I didn't find a good solution to return null with Apache Commons CSV library. I switched to OpenCSV 3.6 and here is the code I used, which I also posted on another thread. Thanks to everyone else that suggested OpenCSV.

CSVReaderBuilder has withFieldAsNull() for this purpose.

CSVReader csvReader = new CSVReaderBuilder(csvFileReader)
    .withFieldAsNull(CSVReaderNullFieldIndicator.EMPTY_SEPARATORS)
    .build();

Comments

0

In Apache commons csv 1.2, we could use the CSVFormat class method withNullString() to convert null strings to NULL. Here null string could be "" or "N/A" or "Nill" according to your requirement.

CSVFormat csvFormat = CSVFormat.DEFAULT.withNullString("");
CSVParser csvParser = new CSVParser(fileReader, csvFormat);

This would give NULL, NULL, L1C3 for the given record in question.

Note: Empty records are automatically converted to empty strings thus resulting in converted to NULL value at last.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.