Problem when reading a CSV file using Java

Question

I am facing a problem reading my CSV file with Java.

The Java code I am using to read the CSV file is:

package collectData;

import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.Charset;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class testCSV {

    public static void main(String[] args) throws IOException {
        String csvFilePath = "csv_files\\20240403_OV64E_A.csv";
        
        Charset encoding = Charset.forName("ISO-8859-1");

        // FileReader(String, Charset) requires Java 11 or later.
        try (Reader in = new FileReader(csvFilePath, encoding);
             // withHeader() with no arguments takes the column names from the first record.
             CSVParser parser = CSVFormat.DEFAULT.withHeader().withDelimiter(';').parse(in)) {
            for (CSVRecord record : parser) {
                for (String headerName : parser.getHeaderNames()) {
                    System.out.print(headerName + ": " + record.get(headerName) + " ");
                }
                System.out.println();
            }
        }
    }
}

When I try to read the file as it is downloaded, I get this error message:

Exception in thread "main" java.lang.IllegalArgumentException: A header name is missing in [Ref file, Version file, state file, App name, App version, APPS Classification, Apps release date, Data ref, Data version, DATA Classification, ]
    at org.apache.commons.csv.CSVParser.createHeaders(CSVParser.java:509)
    at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:438)
    at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:404)
    at org.apache.commons.csv.CSVFormat.parse(CSVFormat.java:1781)
    at collecteRCD.testCSV.main(testCSV.java:28)

When I open the CSV file and I save it without modifying anything, the program works perfectly.

I compared the file in a text editor before and after saving it, and this is what changes:

Before:

Ref file;Version file;state file;App name;App version;APPS Classification;Apps release date;Data ref;Data version;DATA Classification;
003298;4.1;Figée;;;;;
003798;6.1;Figée;;;;;

After:

Ref file;Version file;state file;App name;App version;APPS Classification;Apps release date;Data ref;Data version;DATA Classification
3298;4.1;Figée;;;;;;;;
3798;6.1;Figée;;;;;;;;

I need a way to correct the CSV file automatically: I am building an automation app, so opening and re-saving each file by hand defeats the purpose.

I looked online for solutions but found nothing.

Well, there is a change. The second version doesn't have that trailing ;
– Federico klez Culloca, Commented Apr 3, 2024 at 7:52
How are you opening and saving the file? Because it is definitely not with a text editor (or not just a text editor) if it actually modifies your CSV to be "correct".
– Mark Rotteveel, Commented Apr 3, 2024 at 8:53

Yaohui Wu · Accepted Answer · 2024-04-03 08:24:17Z

1

The original CSV file is missing a header name: the header line has 11 fields (10 names plus an empty one after the trailing ;), while each data row has only 8 fields.

You need to make the following changes:

Add a name for the 11th header column;
Add the missing fields to each row until there are 11 (3 more ;).

The following CSV can be read normally (I tested it):

Ref file;Version file;state file;App name;App version;APPS Classification;Apps release date;Data ref;Data version;DATA Classification;A Header
003298;4.1;Figée;;;;;;;;
003798;6.1;Figée;;;;;;;;
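If the files cannot be fixed at the source, a similar repair could be automated before parsing. The following is only a minimal sketch under stated assumptions: the class `CsvNormalizer` and its methods are hypothetical names, it uses only the Java standard library (Java 11 or later), and it assumes no field is quoted or contains the delimiter or a line break. Note that it takes a slightly different route from the manual fix above: rather than naming the 11th column, it drops the trailing delimiter from the header (as the re-saved file does) and pads short rows with empty fields up to the header width.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Commons CSV): repairs the column layout before parsing. */
final class CsvNormalizer {

    /**
     * Removes one trailing delimiter from the header line and pads every
     * shorter row with empty fields until it has as many fields as the header.
     * Assumption: fields are unquoted and never contain the delimiter.
     */
    static List<String> normalize(List<String> lines, char delimiter) {
        List<String> out = new ArrayList<>();
        if (lines.isEmpty()) {
            return out;
        }
        String sep = String.valueOf(delimiter);

        String header = lines.get(0);
        if (header.endsWith(sep)) {
            // The trailing delimiter is what creates the unnamed last column.
            header = header.substring(0, header.length() - 1);
        }
        int columns = countFields(header, delimiter);
        out.add(header);

        for (String row : lines.subList(1, lines.size())) {
            if (row.isEmpty()) {
                out.add(row); // keep blank lines unchanged
                continue;
            }
            int missing = columns - countFields(row, delimiter);
            // Rows that are already wide enough are left as they are.
            out.add(missing > 0 ? row + sep.repeat(missing) : row);
        }
        return out;
    }

    /** Number of fields in a line: one more than the number of delimiters. */
    static int countFields(String line, char delimiter) {
        int fields = 1;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == delimiter) {
                fields++;
            }
        }
        return fields;
    }

    /** Reads source, normalizes it with ';' as delimiter, and writes the result to target. */
    static void normalizeFile(Path source, Path target, Charset charset) throws IOException {
        Files.write(target, normalize(Files.readAllLines(source, charset), ';'), charset);
    }
}
```

A call such as `CsvNormalizer.normalizeFile(Path.of("in.csv"), Path.of("out.csv"), Charset.forName("ISO-8859-1"))` (both file names are placeholders) would write a repaired copy, which the parser from the question should then accept. Because the lines are handled as plain text, leading zeros such as 003298 are left untouched.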


VHS · Accepted Answer · 2024-04-03 07:58:39Z

0

The issue is likely that the downloaded CSV file has a Byte Order Mark (BOM) at the beginning, which causes the parser to treat the first line (containing the headers) as having an empty field at the start. This throws off the header parsing logic in Apache Commons CSV.

If your downloaded file always has a BOM, you can either re-save it programmatically as UTF-8 or, even better, skip the first record during parsing by calling parser.nextRecord();
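Whether a BOM is actually present can be checked before acting on it. The following is a minimal sketch, assuming a UTF-8 BOM (the bytes EF BB BF) and Java 11 or later; the class `BomCheck` and its method names are hypothetical and not part of any library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: detects a UTF-8 byte order mark at the start of some data. */
final class BomCheck {

    /** Returns true if the bytes begin with the UTF-8 BOM EF BB BF. */
    static boolean startsWithUtf8Bom(byte[] bytes) {
        return bytes.length >= 3
                && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB
                && (bytes[2] & 0xFF) == 0xBF;
    }

    /** Reads at most the first three bytes of the file and checks them for a BOM. */
    static boolean fileStartsWithUtf8Bom(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return startsWithUtf8Bom(in.readNBytes(3));
        }
    }
}
```

If `BomCheck.fileStartsWithUtf8Bom(...)` returns false for the downloaded file, the BOM can be ruled out as the cause and the column layout is the next thing to inspect.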


4 Comments

Mouad Thf Over a year ago

Hello, I tried skipping the first line, but the problem persists.

Mark Rotteveel Over a year ago

That is not the problem with the CSV file: it has a trailing unnamed column, and the number of columns in the header doesn't match the number of columns in the rows.

VHS Over a year ago

Well, when the OP saves the CSV without changing anything, it works. That leads me to believe that a BOM is present and gets removed when the file is re-saved.

Mark Rotteveel Over a year ago

@VHS Except they show something is changed in the CSV after saving: the trailing empty column is removed from the header, and the rows get additional empty columns.
