I have seen other answers on this matter, but all of them rely on a std::stringstream, a temporary char or std::string array, or various external libraries. I would like to use only the fstream header to read a file that contains only numbers (char and short integers as well as floats), separated by commas and spread over more than one line; some may be arrays or vectors. Example:

1,1.1,11.1,11
2,2.2,22.2,22
3,3.3,33.3,33
...

The order is known, since each line follows the variables from a struct. The number of lines may vary, but, for now, let's assume it is also known. Also for the sake of example, let's only consider this order, and these types:

int, double, double, int

Going along with a piece of code I have seen, I tried this simplistic (and, most probably, naive) approach:

int a, d;
double b, c;
char fileName[] {"file.txt"};
std::fstream fs {fileName};
if(!fs.is_open())
    // open with fs.out, write some defaults; this works, no need to mention
else
{
    char comma;
    while(fs.getline(fileName, 100, '\n'))
    {
        fs >> a >> comma >> b >> comma >> c >> comma >> d;
        std::cout << 2*a << ", " << 2*b << ", " << 2*c << ", " << 2*d << '\n';
    }
}

If the file has the three lines above, plus a terminating \n, it outputs this:

4, 4.4, 44.4, 44
6, 6.6, 66.6, 66
6, 6.6, 66.6, 66
*** stack smashing detected ***: <unknown> terminated
Aborted (core dumped)

If I add a \n at the beginning of the file, it outputs:

2, 2.2, 22.2, 22
4, 4.4, 44.4, 44
6, 6.6, 66.6, 66
6, 6.6, 66.6, 66

If I remove the last \n, it works as intended. I have a few questions:

  1. What else can I do when writing the file besides adding a beginning \n and not inserting a terminating one in order to work as intended?

  2. If the number of variables is longer, say 100 per line, what can I do to avoid going 'round the Earth with fs >> a >> c >> ...?

  3. If I only need to read a specific line, or only a few, one method would probably be counting the occurrences of \n, or the lines, somehow. How could I do that?

(edit)

  4. Last but not least, as the title mentions, is it possible to do this without involving other headers, using only fstream (as it currently is, for example)?
  • You might want to go through your code again. It doesn't really make sense. Commented Apr 8, 2018 at 18:26
  • @FeiXiang Thank you for letting me know. I see I reversed two << in the fs >> ... line. Is there something else I am missing? (to not edit too many times) Commented Apr 8, 2018 at 18:29
  • There was some post which changed locale. Search for it Commented Apr 8, 2018 at 18:30
  • parsing csv file c++. Use that search string, you should stumble upon an answer which has csv_classification struct. Commented Apr 8, 2018 at 18:33
  • You are trying to extract a line and put it into the char array called fileName, basically discarding a line since you never read from fileName. Commented Apr 8, 2018 at 18:41

1 Answer

The order is known, since each line follows the variables from a struct. The number of lines may vary, but, for now, let's assume it is also known. Also for the sake of example, let's only consider this order, and these types:

int, double, double, int

If the number and order of the fields is known, then you can simply read with >> or getline, using either the ',' or the '\n' delimiter as required. While it is much wiser to use line-oriented input to read an entire line and then a stringstream to parse the fields, there is no reason you can't do the same thing using only fstream, as you have indicated is your goal. It's not as elegant a solution, but it is a valid one nonetheless.

Using the >> Operator

Your data has 4 fields: the first 3 are delimited by a comma, the final one by the newline. You can simply loop, reading with the >> operator and testing the stream state after each read. Test the stream itself (which reports fail()) rather than eof(): if the last line has no trailing '\n', reading the final field sets eof() even though the read succeeded, and a test on eof() would drop the last record, e.g.

#include <iostream>
#include <fstream>

#define MAXW 128

int main (int argc, char **argv) {

    int a, d;
    double b, c;
    char comma;

    if (argc < 2) {     /* a filename argument is required */
        std::cerr << "usage: " << argv[0] << " <file>\n";
        return 1;
    }

    std::fstream f (argv[1]);
    if (!f.is_open()) {
        std::cerr << "error: file open failed " << argv[1] << ".\n";
        return 1;
    }

    /* loop while all 4 fields of a line are read successfully */
    while (f >> a >> comma >> b >> comma >> c >> comma >> d) {
        std::cout << 2*a << "," << 2*b << "," << 2*c << "," << 2*d << '\n';
        f.ignore (MAXW, '\n');  /* discard the remainder of the line */
    }
    f.close();
}

By keeping a simple field counter n, you can use a switch statement based on the field number to read the correct value into the corresponding variable, and when all fields are read, output (or otherwise store) all 4 values that make up your struct. (Obviously, you can also fill each struct member as it is read.) Nothing special is required, e.g.

#include <iostream>
#include <fstream>

#define NFIELD 4

int main (int argc, char **argv) {

    int a, d, n = 0;
    double b, c;
    char comma;

    if (argc < 2) {     /* a filename argument is required */
        std::cerr << "usage: " << argv[0] << " <file>\n";
        return 1;
    }

    std::fstream f (argv[1]);
    if (!f.is_open()) {
        std::cerr << "error: file open failed " << argv[1] << ".\n";
        return 1;
    }

    for (;;) {          /* loop continually */
        switch (n) {    /* coordinate read based on field number */
            case 0: f >> a >> comma; break;
            case 1: f >> b >> comma; break;
            case 2: f >> c >> comma; break;
            case 3: f >> d; break;
        }
        if (f.fail())   /* stop at end of input or on a malformed field */
            break;
        if (++n == NFIELD) {    /* if all fields read */
            std::cout << 2*a << "," << 2*b << "," << 2*c << "," << 2*d << '\n';
            n = 0;      /* reset field number */
        }
    }
    f.close();
}

Example Input File

Using your provided sample input.

$ cat dat/mixed.csv
1,1.1,11.1,11
2,2.2,22.2,22
3,3.3,33.3,33

Example Use/Output

You obtain your desired output by simply doubling each field on output:

$ ./bin/csv_mixed_read dat/mixed.csv
2,2.2,22.2,22
4,4.4,44.4,44
6,6.6,66.6,66

(the output for both above is the same)

Using getline Delimited by ',' and '\n'

You can use a slight variation on the logic to employ getline. Here, you read the first 3 fields with f.getline(buf, MAXC, ','), and when the 3rd field is found, you read the final field with f.getline(buf, MAXC). For example,

#include <iostream>
#include <fstream>
#include <string>   /* std::stoi, std::stod */

#define NFIELD  4
#define MAXC  128

int main (int argc, char **argv) {

    int a = 0, d = 0, n = 0;
    double b = 0.0, c = 0.0;
    char buf[MAXC];

    if (argc < 2) {     /* a filename argument is required */
        std::cerr << "usage: " << argv[0] << " <file>\n";
        return 1;
    }

    std::fstream f (argv[1]);
    if (!f.is_open()) {
        std::cerr << "error: file open failed " << argv[1] << ".\n";
        return 1;
    }

    while (f.getline(buf, MAXC, ',')) { /* read each field */
        switch (n) {    /* coordinate read based on field number */
            case 0: a = std::stoi (buf); break;
            case 1: b = std::stod (buf); break;
            case 2: c = std::stod (buf); 
                if (!f.getline(buf, MAXC))  /* read d with '\n' delimiter */
                    goto done;
                d = std::stoi (buf);
                break;
        }
        if (++n == NFIELD - 1) {    /* if all fields read */
            std::cout << 2*a << "," << 2*b << "," << 2*c << "," << 2*d << '\n';
            n = 0;      /* reset field number */
        }
    }
    done:;
    f.close();
}

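(note: std::stoi and std::stod skip leading whitespace and stop at the first character that is not part of the number, so stray spaces around the commas are tolerated. Unlike the >> operator, however, they throw std::invalid_argument when a field is empty or not numeric, and each field must fit in MAXC - 1 characters.)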

Example Use/Output

The output is the same.

$ ./bin/csv_mixed_read2 dat/mixed.csv
2,2.2,22.2,22
4,4.4,44.4,44
6,6.6,66.6,66

Regardless of whether you use something like the examples above or a stringstream, you will have to know the number and order of the fields. Whether you use a loop and if..else if..else or a switch, the logic is the same: you need some way of coordinating your read with the correct field. Keeping a simple field counter is about as simple as anything else. Look things over and let me know if you have further questions.

Comments

My purpose is to generate a save file for the struct holding the input data and various other settings. There are different topics, each requiring the same struct. I can enforce a rule of writing, such as csv, and I can determine how many lines to save (nr. of topics), even though it will fill up in time as this will be refreshed after each use. That's why I said it will have a known number of lines and (ordered) values. I find the 1st example very appealing. I don't see it on cppreference, but is there an eol(), too? Or maybe a way to count the \n (I'll probably have to use a char)?
There is no eol(), but f.getline (buf, NCHARS, '\n') allows for the same test. You can always add a simple int n = 0; to the first example and just increment it at the end of each loop. That would count the lines for you. Give it a go. If you have any problems, drop another comment and I'm happy to help.
Most probably I'll just use all the lines, and fill the unused ones with default values. For finding a certain line I thought of having a string with {"one", "two", ...} at the beginning of each line, but I think it will be cumbersome to have to check against strings. Or, looking at cppreference fstream.ignore(), I see I can use that as a way to determine the occurrences of \n, and make a separate function with a loop. Maybe not beautiful, but working (for now). (Just saw your comment) Thank you for the help. I'll mark this as the answer, but I may return with comments. :-)
Sure, that works, but anytime you have the choice between using a simple counter -- or using some other function -- take the counter route and avoid the overhead of a separate function call. (it's minimal, but can add up over a large application...) Also, instead of some number NCHARS in ignore, the proper approach is to include <limits> and use std::numeric_limits<std::streamsize>::max() for the number of characters to ignore; that value tells ignore to skip without a character limit until the delimiter is found.
Then I think a not very bad approach would be to combine them. I can count chars until \n, that will be my MAXC, use it to getline() and >> directly in the variables. If I need to only perform actions on certain lines, I can skip to a different \n based on an enum, or even a string array. Maybe I can even do it without the help of an intermediate char array, or stringstream, as in your examples. So I can use a loop and avoid switch. It starts looking better, and there I thought I would be flayed alive for daring ask such a question, against "the books", by the looks of it.