I'm writing a program that writes arrays and the information regarding them to a binary file.

My first approach was to call fwrite four times per array: once for general information about the array, once for the timestamp, once for the array's dimension, and once for the array itself. This approach is simple and it worked, but it was too slow: the program is multithreaded and writes to a SAS drive frequently, so it flooded the drive with requests and the drive became a bottleneck.

The new approach was to create an array of structs containing the information needed. My struct would be as follows:

struct array_data{
    int information;
    int timestamp;
    int size;
    int* data_array;
};

During execution I would write the data to a buffer, and once I had everything I needed I would call malloc to allocate array_data.data_array and copy everything from the buffer in a for loop.

The issue is that when I call fwrite to write the whole struct, the first 3 members are written correctly but the array is not. That is because the array is not stored contiguously with the rest of the struct: data_array only holds a pointer to wherever malloc placed the data.

The best solution would be to declare data_array as a static array, so that fwrite would work as I need it to. But then I would have to call fwrite for every struct instead of calling it once to write an array of structs, which would hurt performance and negate the benefit of using the struct.

I've also tried using an array of dynamically allocated structs, by declaring my struct as follows:

struct array_data{
    int information;
    int timestamp;
    int size;
    int data_array[];
};

and allocating the array of structs using malloc, but the address of struct_array[1].information is not the one right after struct_array[0].data_array[last_index]. There seem to be another 5 bytes in between, so if I were to call fwrite with struct_array the data in the file would still be incorrect.
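For reference, standard C does not allow a struct with a flexible array member to be an array element, and sizeof(struct array_data) does not count data_array, so indexing struct_array[1] cannot land right after the first record's data. A flexible array member does work for a single record allocated as one block. The following is a minimal sketch under that assumption; record_bytes, record_new and record_write are hypothetical helper names, not part of the original program, and the file gets native-endian ints with whatever layout the compiler chooses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Same members as the struct in the question, with a flexible array member. */
struct array_data {
    int information;
    int timestamp;
    int size;
    int data_array[];   /* not counted by sizeof(struct array_data) */
};

/* Bytes occupied by one record that carries n ints (hypothetical helper). */
static size_t record_bytes(int n)
{
    assert(n >= 0);
    return sizeof(struct array_data) + (size_t)n * sizeof(int);
}

/* Allocate header and payload as one contiguous block; NULL on failure. */
static struct array_data *record_new(int information, int timestamp,
                                     const int *values, int n)
{
    struct array_data *rec = malloc(record_bytes(n));
    if (rec == NULL)
        return NULL;
    rec->information = information;
    rec->timestamp = timestamp;
    rec->size = n;
    if (n > 0)
        memcpy(rec->data_array, values, (size_t)n * sizeof(int));
    return rec;
}

/* Header and payload are adjacent, so one fwrite covers the whole record. */
static int record_write(const struct array_data *rec, FILE *fp)
{
    return fwrite(rec, record_bytes(rec->size), 1, fp) == 1 ? 0 : -1;
}
```

This still needs one fwrite per record, because records of different sizes cannot be stepped through with ordinary array indexing.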

Is there a way to use structs to solve this issue or should I just stick with writing my arrays to the file as I did in the first place?

  • You can write the whole struct to the file, even with the address of data. Upon reading it back you must ignore this address. After writing the struct, you write the size integers contained in data. When reading back the struct, you allocate size integers for the data pointer and then read those from the file into the allocated array. Commented Nov 27, 2018 at 14:00
  • As for your question "Is there a way to use structs to solve this issue": no, there is no way (because the data has a dynamic size allocated at run-time). Commented Nov 27, 2018 at 14:02
  • Generally, if I need to make fewer, larger writes instead of many smaller ones, I accumulate the data in a memory buffer laid out exactly as it will exist in the file, and when that buffer is full I write it to the file. The format of the data would likely be what you've done: the static part of the struct, then the size of the dynamic data, then the dynamic data right after it. Commented Nov 27, 2018 at 14:32
  • "the address of the array not being contiguous" is a somewhat misleading statement. The value of data_array is stored contiguously inside the struct, but that value is itself a pointer to some other place in memory. If this area doesn't change, your serializer function could simply write the first three values and then the actual array. But even this is not portable: struct layout in memory and integer endianness are compiler- and architecture-dependent. The best approach would be to have a plain buffer and write the values into it using a specific format (protocol) before flushing it. Commented Nov 27, 2018 at 14:43
  • In order to answer this, we will need to see the code doing the file access and the code doing the allocation. Commented Nov 27, 2018 at 14:45
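The buffering idea from the comments above could be sketched roughly as follows. This is only an illustration under stated assumptions: struct write_buffer, buffer_append, buffer_flush and the 64 KiB BUF_CAPACITY are hypothetical names and values, the record format is three header ints followed by the payload, and ints are written in the machine's native byte order, so the file is not portable across architectures.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BUF_CAPACITY (64u * 1024u)   /* arbitrary size chosen for this sketch */

/* Staging area holding bytes exactly as they will appear in the file. */
struct write_buffer {
    unsigned char bytes[BUF_CAPACITY];
    size_t used;
};

/* Write everything accumulated so far with a single fwrite; 0 on success. */
static int buffer_flush(struct write_buffer *wb, FILE *fp)
{
    if (wb->used > 0 && fwrite(wb->bytes, 1, wb->used, fp) != wb->used)
        return -1;
    wb->used = 0;
    return 0;
}

/* Append one record: information, timestamp, size, then size payload ints. */
static int buffer_append(struct write_buffer *wb, FILE *fp,
                         int information, int timestamp,
                         const int *data, int n)
{
    assert(n >= 0);
    int header[3] = { information, timestamp, n };
    size_t payload = (size_t)n * sizeof(int);
    size_t need = sizeof header + payload;

    if (need > BUF_CAPACITY) {
        /* Record is larger than the buffer: flush, then write it directly. */
        if (buffer_flush(wb, fp) != 0) return -1;
        if (fwrite(header, sizeof header, 1, fp) != 1) return -1;
        if (payload > 0 && fwrite(data, payload, 1, fp) != 1) return -1;
        return 0;
    }
    /* Not enough room left: empty the buffer before appending. */
    if (need > BUF_CAPACITY - wb->used && buffer_flush(wb, fp) != 0)
        return -1;
    memcpy(wb->bytes + wb->used, header, sizeof header);
    if (payload > 0)
        memcpy(wb->bytes + wb->used + sizeof header, data, payload);
    wb->used += need;
    return 0;
}
```

A caller would append records as they are produced and call buffer_flush once more before closing the file. In a multithreaded program, each thread would need its own buffer or a lock around these calls.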

1 Answer

The following example creates, writes and reads your data. It is just an outline. Error checks on malloc, fread and fwrite are omitted:

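#include <stdio.h>
#include <stdlib.h>

#define N_DATA 10
#define N_INTS 5

struct array_data{
    int information;
    int timestamp;
    int size;
    int* data_array;
};
struct array_data arr[N_DATA];

/* Allocate a zero-filled payload of N_INTS ints for every record. */
void makeData(void){
    int i;
    for (i=0;i<N_DATA;i++) {
        arr[i].data_array=calloc(N_INTS,sizeof(int));
        arr[i].size= N_INTS;
    }
}
/* For each record, write the struct, then the ints it points to. */
void writeData(FILE *fp_out)
{
    int i;
    for (i=0;i<N_DATA;i++) {
        /* The pointer value is written too; the reader must ignore it. */
        fwrite(&arr[i],sizeof(arr[i]),1,fp_out);
        fwrite(arr[i].data_array,arr[i].size*sizeof(int),1,fp_out);
    }
}
/* Read records back until the file ends or arr is full. */
void readData(FILE *fp_in)
{
    int i= 0;
    while(i<N_DATA && fread(&arr[i],sizeof(arr[i]),1,fp_in)==1) {
        /* Replace the stale pointer read from the file with fresh storage. */
        arr[i].data_array=malloc(arr[i].size*sizeof(int));
        fread(arr[i].data_array,arr[i].size*sizeof(int),1,fp_in);
        i++;
    }
}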

1 Comment

Thank you for your example. This is what I was considering doing, but by using my first approach with a larger write buffer, as was suggested, I was able to improve the execution times.
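The comment does not say how the larger buffer was set up. One possible way, shown here purely as an assumption, is to keep the four fwrite calls and enlarge the stream's own buffer with the standard setvbuf function. use_large_buffer and write_array are hypothetical names; the buffer is owned by the caller and must stay valid until the stream is closed.

```c
#include <assert.h>
#include <stdio.h>

/* Give an open stream a caller-owned, fully buffered buffer.
 * Must be called before any other operation on the stream. */
static int use_large_buffer(FILE *fp, char *buf, size_t buf_size)
{
    return setvbuf(fp, buf, _IOFBF, buf_size) == 0 ? 0 : -1;
}

/* The four-call approach from the question: information, timestamp,
 * size, then the array itself. Returns 0 on success, -1 on a short write. */
static int write_array(FILE *fp, int information, int timestamp,
                       const int *data, int size)
{
    assert(size >= 0);
    if (fwrite(&information, sizeof information, 1, fp) != 1) return -1;
    if (fwrite(&timestamp, sizeof timestamp, 1, fp) != 1) return -1;
    if (fwrite(&size, sizeof size, 1, fp) != 1) return -1;
    if (size > 0 &&
        fwrite(data, sizeof *data, (size_t)size, fp) != (size_t)size)
        return -1;
    return 0;
}
```

With full buffering, the small fwrite calls are collected in memory and reach the drive only when the buffer fills or the stream is flushed or closed.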
