I have a struct and I would like to write it to a binary file (C++, Visual Studio 2008). The struct is:

struct DataItem
{
  std::string tag;        
  std::vector<int> data_block;
  DataItem(): data_block(1024 * 1024){}
};

I am filling the data_block vector with random values:

DataItem createSampleData ()
{   
    DataItem data;  
    std::srand(std::time(NULL));
    std::generate(data.data_block.begin(), data.data_block.end(), std::rand);   
    data.tag = "test";
    return data;
}

Then I try to write the struct to a file:

void writeData (DataItem data, long fileName)
{
    ostringstream ss;
    ss << fileName;
    string s(ss.str());
    s += ".bin";

    char szPathedFileName[MAX_PATH] = {0};
    strcat(szPathedFileName,ROOT_DIR);
    strcat(szPathedFileName,s.c_str());
    ofstream f(szPathedFileName, ios::out | ios::binary | ios::app);            
    // ******* first I tried to write this way then one by one  
    //f.write(reinterpret_cast<char *>(&data), sizeof(data));
    // *******************************************************
    f.write(reinterpret_cast<const char *>(&data.tag), sizeof(data.tag));
    f.write(reinterpret_cast<const char *>(&data.data_block), sizeof(data.data_block));
    f.close();
}

And the main is:

int main()
{
    DataItem data = createSampleData();
    for (int i=0; i<5; i++) {
         writeData(data,i); 
    }
}

So I expect a file size of at least (1024 * 1024) * 4 bytes (for the vector) + 48 bytes (for the tag), but it only writes the tag to the file and creates a 1 KB file on the hard drive.

I can see the contents while I'm debugging, but they are not written to the file...

What's wrong with this code? Why can't I write the struct with its vector to a file? Is there a better, faster, or more efficient way to write it?

Do I have to serialize the data?

Thanks...

  • sizeof on a vector doesn't do what you want it to because it doesn't take into account the array. In fact it can't, because sizeof is evaluated at compile time and the array size is known only at runtime. What you're getting is the size of a few ints (size and capacity) and a pointer. Commented Jan 30, 2014 at 8:42
  • So what is the right way to write this struct to a file? Commented Jan 30, 2014 at 8:46

2 Answers

Casting a std::string to char * will not produce the result you expect. Neither will using sizeof on it. The same for a std::vector.
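
To make the sizeof point concrete, here is a minimal sketch. The helper names objectBytes and payloadBytes are hypothetical and only serve the illustration. The first returns the size of the vector object itself, which is fixed at compile time; the second returns the number of bytes the elements occupy.

```cpp
#include <cstddef>
#include <vector>

// Size of the vector object itself (its internal bookkeeping), in bytes.
// sizeof is evaluated at compile time, so this is the same for every
// std::vector<int>, no matter how many elements it holds.
std::size_t objectBytes(const std::vector<int>& v)
{
    return sizeof(v);
}

// Number of bytes occupied by the elements the vector stores.
// This is the figure that matters when writing the contents to a file.
std::size_t payloadBytes(const std::vector<int>& v)
{
    return v.size() * sizeof(int);
}
```

For a vector of 1024 * 1024 ints, payloadBytes gives 1024 * 1024 * sizeof(int), while objectBytes gives the same small value as for a vector with a single element.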

For the vector you need a pointer to its elements: use either the std::vector::data method (added in C++11, so it may not be available in Visual Studio 2008) or &data.data_block[0]. As for the size, use data.data_block.size() * sizeof(int).

Writing the string is another matter though, especially if it can be of variable length. You either have to write it as a fixed-length string, or write the length (in a fixed-size format) followed by the actual string, or write a terminator at the end of the string. To get a C-style pointer to the string use std::string::c_str.
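
The approach above could look roughly like the following sketch, which uses the "length followed by the actual string" option. The helpers writeString, writeInts, writeItem and toBytes are hypothetical names, and two things are assumptions rather than part of the answer: the length is stored as an unsigned int, and an element count is also written before the vector data so that a reader would know how much to read back.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Same members as the struct in the question.
struct DataItem
{
    std::string tag;
    std::vector<int> data_block;
};

// Writes the string length (assumed to fit in an unsigned int),
// then the characters themselves, without a terminator.
void writeString(std::ostream& out, const std::string& s)
{
    const unsigned int length = static_cast<unsigned int>(s.size());
    out.write(reinterpret_cast<const char*>(&length), sizeof(length));
    out.write(s.data(), static_cast<std::streamsize>(s.size()));
}

// Writes the element count (an assumption of this sketch),
// then the raw ints in a single call.
void writeInts(std::ostream& out, const std::vector<int>& v)
{
    const unsigned int count = static_cast<unsigned int>(v.size());
    out.write(reinterpret_cast<const char*>(&count), sizeof(count));
    if (!v.empty())
    {
        // &v[0] points at the first element; it works on pre-C++11 compilers too.
        out.write(reinterpret_cast<const char*>(&v[0]),
                  static_cast<std::streamsize>(v.size() * sizeof(int)));
    }
}

// Writes one item member by member instead of casting the whole struct.
void writeItem(std::ostream& out, const DataItem& item)
{
    writeString(out, item.tag);
    writeInts(out, item.data_block);
}

// Convenience for inspecting the result in memory instead of on disk.
std::string toBytes(const DataItem& item)
{
    std::ostringstream out;
    writeItem(out, item);
    return out.str();
}
```

Because writeItem takes a std::ostream&, the ofstream from the question (opened with ios::binary) can be passed to it directly. The data is written in the machine's native byte order and int size, so the file is only portable between compatible platforms.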

4 Comments

Thank you, I will try this now. For the string I can use a fixed-size char array instead. By the way, do I need to serialize the data? Does it have any performance benefits for reading and writing large files?
@Vecihi What you're doing when you write the structure to file is serialization.
Don't we need to use a serialization procedure like the ones in Boost or standard C++ to convert and write the data as binary? Does that make any performance difference in writing and reading?
@Vecihi Technically, any conversion of data from one format to another could be called serialization. Boost serialization is one implementation, what you do is another. Performance wise I can't really say, you have to benchmark.

Welcome to the merry world of C++ std::

Basically, vectors are meant to be used as opaque containers.
You can forget about reinterpret_cast right away.
Trying to shut the compiler up will allow you to create an executable, but it will produce silly results.

Basically, you can forget about most of the std::vector syntactic sugar that has to do with iterators, since your fstream will not access binary data through them (it would output a textual representation of your data).

But all is not lost.

You can access the vector's underlying array using the newly (C++11) introduced .data() method, though that defeats the point of using an opaque type.

const int * raw_ptr = data.data_block.data();

that will gain you 100 points of cool factor instead of using the puny

const int * raw_ptr = &data.data_block[0];

You could also use the even more cryptic &data.data_block.front() for a cool factor bonus of 50 points.

You can then write your glob of ints in one go:

f.write (reinterpret_cast<const char *>(raw_ptr), sizeof (raw_ptr[0])*data.data_block.size());

Now if you want to do something really too simple, try this:

for (size_t i = 0 ; i != data.data_block.size() ; i++)
    f.write (reinterpret_cast<const char *>(&data.data_block[i]), sizeof (data.data_block[i]));

This will consume a few more microseconds, which will be lost in background noise since the disk I/O will take much more time to complete the write.

Totally not cool, though.
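
As a rough check that the two approaches above are interchangeable, the sketch below writes the same vector once in a single call and once element by element, then compares the bytes. The function names writeBulk, writeOneByOne and sameOutput are hypothetical, and an in-memory std::ostringstream stands in for the file stream.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <vector>

// One write call for the whole array.
void writeBulk(std::ostream& out, const std::vector<int>& v)
{
    if (v.empty())
        return;
    out.write(reinterpret_cast<const char*>(&v[0]),
              static_cast<std::streamsize>(v.size() * sizeof(v[0])));
}

// One write call per element.
void writeOneByOne(std::ostream& out, const std::vector<int>& v)
{
    for (std::size_t i = 0; i != v.size(); ++i)
        out.write(reinterpret_cast<const char*>(&v[i]), sizeof(v[i]));
}

// Returns true when both approaches produce byte-identical output.
bool sameOutput(const std::vector<int>& v)
{
    std::ostringstream bulk;
    std::ostringstream single;
    writeBulk(bulk, v);
    writeOneByOne(single, v);
    return bulk.str() == single.str();
}
```

Which variant is faster in practice depends on stream buffering and the disk, so it is worth measuring rather than assuming.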
