0

I am having trouble reading a CSV file in Java with Apache Commons CSV.

The Java code I am using to read the CSV file is:

package collectData;

import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.Charset;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class testCSV {

    public static void main(String[] args) throws IOException {
        String csvFilePath = "csv_files\\20240403_OV64E_A.csv";
        
        Charset encoding = Charset.forName("ISO-8859-1");

        try (Reader in = new FileReader(csvFilePath, encoding);
             CSVParser parser = CSVFormat.DEFAULT.withHeader().withDelimiter(';').parse(in)) {
            for (CSVRecord record : parser) {
                for (String headerName : parser.getHeaderNames()) {
                    System.out.print(headerName + ": " + record.get(headerName) +" ");
                }
                System.out.println(); 
            }
        }
    }
}

When I read the file exactly as it was downloaded, I get this error message:

Exception in thread "main" java.lang.IllegalArgumentException: A header name is missing in [Ref file, Version file, state file, App name, App version, APPS Classification, Apps release date, Data ref, Data version, DATA Classification, ]
    at org.apache.commons.csv.CSVParser.createHeaders(CSVParser.java:509)
    at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:438)
    at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:404)
    at org.apache.commons.csv.CSVFormat.parse(CSVFormat.java:1781)
    at collecteRCD.testCSV.main(testCSV.java:28)

When I open the CSV file and save it again without modifying anything, the program works perfectly.

I compared the file in a text editor before and after saving it, and this is what changes:

Before:

Ref file;Version file;state file;App name;App version;APPS Classification;Apps release date;Data ref;Data version;DATA Classification;
003298;4.1;Figée;;;;;
003798;6.1;Figée;;;;;

After:

Ref file;Version file;state file;App name;App version;APPS Classification;Apps release date;Data ref;Data version;DATA Classification
3298;4.1;Figée;;;;;;;;
3798;6.1;Figée;;;;;;;;

I need a way to correct the CSV file automatically. I am building an automation app, so opening and re-saving every file by hand is not an option.

I searched online for a solution but found nothing.

  • Well, there is a change. The second version doesn't have that trailing ; Commented Apr 3, 2024 at 7:52
  • How are you opening and saving the file? Because it is definitely not with a text editor (or not just a text editor) if it actually modifies your CSV to be "correct". Commented Apr 3, 2024 at 8:53

2 Answers

1

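The original CSV file is missing a header name: the trailing ; at the end of the header line creates an 11th column with an empty name, which is what the exception reports. The data rows are also too short, with only 8 fields each against 11 header columns.

You need to make the following changes:

  1. Add a name for the 11th header column;
  2. Add the missing fields to each data row until there are 11 (3 more ;).

The following CSV can be read normally (I tested it):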

Ref file;Version file;state file;App name;App version;APPS Classification;Apps release date;Data ref;Data version;DATA Classification;A Header
003298;4.1;Figée;;;;;;;;
003798;6.1;Figée;;;;;;;;
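
If the files have to be fixed without manual editing, a small pre-processing step could do it before the parser sees them. The following is only a minimal sketch using the plain JDK. It is a variant of the fix above: instead of naming the extra column, it drops the trailing ; from the header (as in the re-saved file in the question) and pads short rows with empty fields. The class CsvNormalizer and its method names are made up for illustration and are not part of Commons CSV, and the code assumes that no quoted field contains the delimiter or a line break.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Apache Commons CSV. */
final class CsvNormalizer {

    /** Counts the fields on a line. Assumes the delimiter never appears inside a quoted field. */
    static int countFields(String line, char delimiter) {
        int count = 1;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == delimiter) {
                count++;
            }
        }
        return count;
    }

    /** Strips trailing delimiters from the header and pads short rows to the header width. */
    static List<String> normalizeLines(List<String> lines, char delimiter) {
        List<String> result = new ArrayList<>(lines.size());
        if (lines.isEmpty()) {
            return result;
        }

        // Remove trailing delimiters so that no header column has an empty name.
        String header = lines.get(0);
        while (!header.isEmpty() && header.charAt(header.length() - 1) == delimiter) {
            header = header.substring(0, header.length() - 1);
        }
        result.add(header);
        int columns = countFields(header, delimiter);

        for (String line : lines.subList(1, lines.size())) {
            if (line.isEmpty()) {
                result.add(line); // keep blank lines untouched
                continue;
            }
            // Append one delimiter per missing field; longer rows are left as they are.
            StringBuilder padded = new StringBuilder(line);
            for (int missing = columns - countFields(line, delimiter); missing > 0; missing--) {
                padded.append(delimiter);
            }
            result.add(padded.toString());
        }
        return result;
    }

    /** Reads source, normalizes it with ';' as the delimiter, and writes the result to target. */
    static void normalizeFile(Path source, Path target, Charset charset) throws IOException {
        List<String> lines = Files.readAllLines(source, charset);
        Files.write(target, normalizeLines(lines, ';'), charset);
    }
}
```

Calling normalizeFile with the ISO-8859-1 charset before handing the target file to the parser would keep the leading zeros of values such as 003298, which the manual re-save in the question dropped. Commons CSV also offers CSVFormat.withAllowMissingColumnNames(), which may avoid the exception without touching the file (untested against this file); the short rows would then still need a record.isSet(name) check before record.get(name).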

0

The issue you're facing is likely caused by a Byte Order Mark (BOM) at the beginning of the downloaded CSV file, which makes the parser treat the first line (containing the headers) as having an empty field at the start. This throws off the header parsing logic in Apache Commons CSV.

If your downloaded file always has a BOM, you can either re-save it programmatically as UTF-8 or, even better, just skip the first line during parsing by calling parser.nextRecord();
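
Whether a BOM is actually present can be checked rather than guessed, by looking at the first three bytes of the file. The following is a minimal sketch using only the JDK (Java 9 or later for readNBytes); the class BomCheck and its method names are hypothetical, and it only recognizes the UTF-8 BOM (bytes EF BB BF), not the UTF-16 or UTF-32 variants.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical diagnostic helper; only detects the UTF-8 BOM. */
final class BomCheck {

    /** Returns true if the given bytes begin with the UTF-8 BOM (EF BB BF). */
    static boolean startsWithUtf8Bom(byte[] bytes) {
        return bytes.length >= 3
                && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB
                && (bytes[2] & 0xFF) == 0xBF;
    }

    /** Reads at most the first three bytes of the file and checks them for the UTF-8 BOM. */
    static boolean fileStartsWithUtf8Bom(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] head = new byte[3];
            int read = in.readNBytes(head, 0, 3); // may be fewer than 3 for tiny files
            return startsWithUtf8Bom(Arrays.copyOf(head, read));
        }
    }
}
```

If fileStartsWithUtf8Bom returns false for the downloaded file, the BOM explanation can be ruled out and the cause has to be something else in the file.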

4 Comments

Hello, I tried skipping the first line, but the problem is still there.
That is not the problem with the CSV file: it has a trailing unnamed column, and the number of columns in the header doesn't match the number of columns in the rows.
Well, when the OP saves the CSV without changing anything, it works. That leads me to believe a BOM is present and is removed when the file is re-saved.
@VHS Except they show something is changed in the CSV after saving: the trailing empty column is removed from the header, and the rows get additional empty columns.
