I'm using Apache Commons CSV for parsing, but it doesn't ignore a missing column and throws an exception instead.

With this sample data:

name age
Ali 35
John 25
Vahid 75

In the code below, record.get(DataColumns.surname) throws java.lang.IllegalArgumentException: Mapping for surname not found, expected one of [name, surname, age]. I need it to return null, an Optional, or a default value instead. Is there an option for this? I know it is possible with record.toMap().get(DataColumns.surname.name()), but its performance will not be good:

...
enum DataColumns { name, surname, age }
...
Reader in = new BufferedReader(new FileReader(fileName));

try (CSVParser records = CSVFormat.TDF
                .withDelimiter(' ')
                .withIgnoreSurroundingSpaces()
                .withAllowDuplicateHeaderNames(false)
                .withIgnoreHeaderCase()
                .withTrim()
                .withHeader(DataColumns.class)
                .withFirstRecordAsHeader()
                .withSkipHeaderRecord()
                .withAllowMissingColumnNames(false)
                .withIgnoreEmptyLines()
                .parse(in)) {

   for (CSVRecord record : records) {
       String name = record.get(DataColumns.name);
       String surname = record.get(DataColumns.surname);
       Short age = Short.valueOf(record.get(DataColumns.age)); 
   }
}

...

2 Answers

1

You could use record.isMapped(columnName) to check whether the column exists, storing the result in a variable so you don't have to check again for every line.
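A minimal sketch of that check, assuming Commons CSV is on the classpath. The class SafeGet and the helpers getOrDefault and readColumn are hypothetical names for illustration, not part of the library, and the sample data is read from a string rather than a file. It uses the same with* style as the question (deprecated in newer releases):

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class SafeGet {
    enum DataColumns { name, surname, age }

    // Hypothetical helper: returns the cell value, or defaultValue when the
    // column is absent from the header (where get() would throw).
    static String getOrDefault(CSVRecord record, Enum<?> column, String defaultValue) {
        return record.isMapped(column.name()) ? record.get(column) : defaultValue;
    }

    // Parses space-delimited text whose first line is the header and
    // collects one column, substituting defaultValue if it is not mapped.
    static List<String> readColumn(String data, Enum<?> column, String defaultValue) {
        CSVFormat format = CSVFormat.DEFAULT.withDelimiter(' ').withFirstRecordAsHeader();
        List<String> values = new ArrayList<>();
        try (CSVParser records = format.parse(new StringReader(data))) {
            for (CSVRecord record : records) {
                values.add(getOrDefault(record, column, defaultValue));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        String data = "name age\nAli 35\nJohn 25\nVahid 75\n";
        System.out.println(readColumn(data, DataColumns.name, "n/a"));    // [Ali, John, Vahid]
        System.out.println(readColumn(data, DataColumns.surname, "n/a")); // [n/a, n/a, n/a]
    }
}
```

This version still calls isMapped for every record; it is a single map lookup, so that is probably cheap, but the result can also be cached once per file.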

Another option would be to call records.getHeaderNames() once, before the loop, and store the result in a variable, maybe even in a Set<String> for faster existence checks: Set<String> headerNames = new HashSet<>(records.getHeaderNames()).

Then, you can use the resulting variable inside the loop by calling headerNames.contains(columnName) to check whether the column exists or not.
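Here is a rough sketch of the header-set approach, again assuming Commons CSV (1.7 or later, for getHeaderNames()) is available. HeaderCheck and readSurnames are made-up names, and the "Doe" surname in the second call is invented sample data. The column check happens once, before the loop, and each row yields an Optional:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class HeaderCheck {
    // Hypothetical helper: one Optional per row, empty when the file
    // has no "surname" column at all.
    static List<Optional<String>> readSurnames(String data) {
        CSVFormat format = CSVFormat.DEFAULT.withDelimiter(' ').withFirstRecordAsHeader();
        List<Optional<String>> surnames = new ArrayList<>();
        try (CSVParser records = format.parse(new StringReader(data))) {
            // Look at the header once; column names do not change between rows.
            Set<String> headerNames = new HashSet<>(records.getHeaderNames());
            boolean hasSurname = headerNames.contains("surname");
            for (CSVRecord record : records) {
                surnames.add(hasSurname
                        ? Optional.ofNullable(record.get("surname"))
                        : Optional.<String>empty());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return surnames;
    }

    public static void main(String[] args) {
        System.out.println(readSurnames("name age\nAli 35\nJohn 25\n"));
        System.out.println(readSurnames("name surname age\nJohn Doe 25\n"));
    }
}
```

This only guards against a column missing from the header. If individual rows can also be shorter than the header, record.isSet(columnName) checks both that the column is mapped and that the row has a value at that position.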

Please see: https://javadoc.io/doc/org.apache.commons/commons-csv/latest/org/apache/commons/csv/CSVRecord.html

4 Comments

This is OK, but with more columns it may hurt performance.
One workaround would be to store its return value in a variable on the first record.
So check only once?
Exactly; individual rows may have missing cells, but the column names will stay the same.
0

There is a method record.get(String), but you passed an enum instead.

Try record.get(DataColumns.name.name())

2 Comments

Please try to explain your answer.
They are the same thing: the overload String get(Enum<?> e) calls String get(String name).
