I am writing a C program to parse data from a CSV file. Some fields in the file may be empty, and I need to handle those cases properly. Here are a few example rows:

123,John,Doe,,New York
124,Mary,,01/01/2000,Los Angeles
125,,,,

I want to store the parsed data in an array of structs:

typedef struct {
    int ID;
    char FirstName[25];
    char LastName[25];
    char dateOfBirth[15];
    char cityOfBirth[25];
    int del;
} Student;

I attempted to parse the CSV file using sscanf like this:

int read = sscanf(line, "%d,%24[^,],%24[^,],%14[^,],%24[^,\n]", 
                  &student.ID, 
                  student.FirstName, 
                  student.LastName, 
                  student.dateOfBirth, 
                  student.cityOfBirth);  
student.del = 0;  

if (read != 5) {
    printf("Error reading line or handling empty fields.\n");
    return;
}

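However, when a field is empty (e.g., ,,), sscanf doesn't work as I expected. A %[^,] conversion needs at least one character to match, so parsing stops at the first empty field, the remaining fields are never filled in, and the return value is less than 5.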
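To make the failure concrete, here is a minimal sketch that runs the same format string over the three example rows and reports how many conversions succeed. The wrapper name try_sscanf is made up for illustration; the struct and format string are the ones from above.

```c
#include <assert.h>
#include <stdio.h>

typedef struct {
    int ID;
    char FirstName[25];
    char LastName[25];
    char dateOfBirth[15];
    char cityOfBirth[25];
    int del;
} Student;

/* Hypothetical wrapper: applies the format string from the question
   and returns the number of fields sscanf managed to convert. */
int try_sscanf(const char *line, Student *s)
{
    assert(line != NULL && s != NULL);
    return sscanf(line, "%d,%24[^,],%24[^,],%14[^,],%24[^,\n]",
                  &s->ID,
                  s->FirstName,
                  s->LastName,
                  s->dateOfBirth,
                  s->cityOfBirth);
}
```

Calling try_sscanf on "123,John,Doe,,New York" yields 3, on "124,Mary,,01/01/2000,Los Angeles" it yields 2, and on "125,,,," it yields 1. In each case the count stops just before the first empty field, which is why the read != 5 check rejects every one of these rows.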

To work around this, I tried using strtok to tokenize the CSV row after reading it as a string:

char delimiters[] = ",";
char line[256];
fgets(line, sizeof(line), csvFile);

char *token = strtok(line, delimiters);
if (token != NULL) student.ID = atoi(token);
else student.ID = 0;  // Handle empty ID

token = strtok(NULL, delimiters);
if (token != NULL) strcpy(student.FirstName, token);
else strcpy(student.FirstName, "");  // Handle empty field

This partially works, but strtok skips consecutive delimiters, so I cannot reliably detect and handle empty fields.
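One possible way around the strtok behaviour is to split the line by hand with strcspn, which reports a length of 0 for an empty field instead of skipping it. The sketch below is only that, a sketch: split_fields, copy_field and parse_student are hypothetical helper names, it assumes exactly five fields per row, and it assumes no field contains a comma or quotes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int ID;
    char FirstName[25];
    char LastName[25];
    char dateOfBirth[15];
    char cityOfBirth[25];
    int del;
} Student;

/* Hypothetical helper: splits line in place on ',' and keeps empty fields.
   Returns the number of fields stored; fields beyond max_fields are dropped. */
int split_fields(char *line, char *fields[], int max_fields)
{
    assert(line != NULL && fields != NULL);
    line[strcspn(line, "\r\n")] = '\0';   /* drop the newline left by fgets */

    int count = 0;
    char *p = line;
    while (count < max_fields) {
        fields[count++] = p;
        size_t len = strcspn(p, ",");     /* field length, 0 if empty */
        if (p[len] == '\0')
            break;                        /* that was the last field */
        p[len] = '\0';                    /* terminate this field */
        p += len + 1;                     /* step past the comma */
    }
    return count;
}

/* Bounded copy: truncates instead of overflowing the destination. */
void copy_field(char *dst, size_t dst_size, const char *src)
{
    snprintf(dst, dst_size, "%s", src);
}

/* Returns 1 on success, 0 if the row does not have exactly 5 fields. */
int parse_student(char *line, Student *s)
{
    char *f[5];
    if (split_fields(line, f, 5) != 5)
        return 0;

    s->ID = (int)strtol(f[0], NULL, 10);  /* an empty ID becomes 0 */
    copy_field(s->FirstName, sizeof s->FirstName, f[1]);
    copy_field(s->LastName, sizeof s->LastName, f[2]);
    copy_field(s->dateOfBirth, sizeof s->dateOfBirth, f[3]);
    copy_field(s->cityOfBirth, sizeof s->cityOfBirth, f[4]);
    s->del = 0;
    return 1;
}
```

With this, each line read by fgets can be passed to parse_student (the buffer is modified in place). For "125,,,," the four text members come back as empty strings rather than being skipped. snprintf is used instead of strcpy so that an over-long field is truncated to fit the struct member.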

  • Checking for commas alone won't work, because a comma may appear inside a quoted text string, and in that case it does not start a new field. So this is perhaps more a logic question than a C question. At any rate, scanf() will never fully succeed, in my opinion, because you must read the CSV file a character at a time (or at least parse the read buffer a character at a time) to catch all the nuances of the CSV format. I recommend first obtaining the rule/logic documents for CSV encoding, then building your code from them. Commented Nov 30, 2024 at 15:55
  • Either scan the input as @gregspears mentions, or check online for one of a number of C CSV parsers, e.g. github.com/rgamble/libcsv Commented Nov 30, 2024 at 17:01
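As a rough illustration of the character-at-a-time approach suggested in the comments, here is a sketch of a splitter that also honours double-quoted fields, loosely following the quoting rules described in RFC 4180. The function name split_csv_quoted is hypothetical, and the sketch assumes one record per line, so quoted fields that span several lines are not supported. For anything beyond a small exercise, a tested library such as the one linked above is probably the safer choice.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: splits one CSV record in place, one character at a
   time. Empty fields are kept, commas inside "..." do not split, and ""
   inside a quoted field stands for a literal quote.
   Returns the number of fields stored; fields beyond max_fields are dropped. */
int split_csv_quoted(char *line, char *fields[], int max_fields)
{
    assert(line != NULL && fields != NULL);
    int count = 0;
    char *src = line;   /* read cursor */
    char *dst = line;   /* write cursor, never ahead of src */

    while (count < max_fields) {
        fields[count++] = dst;

        if (*src == '"') {                  /* quoted field */
            src++;                          /* skip the opening quote */
            while (*src != '\0') {
                if (*src == '"') {
                    if (src[1] == '"') {    /* "" is an escaped quote */
                        *dst++ = '"';
                        src += 2;
                        continue;
                    }
                    src++;                  /* skip the closing quote */
                    break;
                }
                *dst++ = *src++;
            }
        }

        /* Unquoted field, or any stray text after a closing quote. */
        while (*src != '\0' && *src != ',' && *src != '\n' && *src != '\r')
            *dst++ = *src++;

        int at_comma = (*src == ',');       /* read before overwriting */
        if (at_comma)
            src++;
        *dst++ = '\0';                      /* terminate this field */
        if (!at_comma)
            break;                          /* end of record */
    }
    return count;
}
```

For the row 125,,,, this returns 5 fields, the last four empty. For a made-up row such as 126,"Smith, Jr.",Doe,,Boston it also returns 5 fields, with the second one being Smith, Jr. including its comma. The resulting pointers can then be copied into the Student members with a bounded copy.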
