
I am creating a file with some data objects inside. The data objects have different sizes and look something like this (very simplified):

struct Data{
   uint64_t size;
   char     blob[MAX_SIZE];
// ... methods here:
};

At some later step, the file will be mapped into memory with mmap(), so I want every data object to start at an address aligned to 8 bytes, which is where uint64_t size will be stored (let's ignore endianness).

The code looks more or less like this (the alignment is currently hardcoded to 8 bytes):

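// Returns the number of bytes needed to reach the next multiple of align_size.
// Note: returns align_size (not 0) when value is already aligned.
size_t calcAlign(size_t const value, size_t const align_size){
    return align_size - value % align_size;
}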

template<class ITERATOR>
void process(std::ofstream &file_data, ITERATOR begin, ITERATOR end){
    for(auto it = begin; it != end; ++it){
        const auto &data = *it;

        size_t bytesWritten = data.writeToFile(file_data);

        size_t const alignToBeAdded = calcAlign(bytesWritten, 8);

        if (alignToBeAdded != 8){
            uint64_t const placeholder = 0;

            file_data.write( (const char *) & placeholder, (std::streamsize) alignToBeAdded);
        }
    }
}

Is this the best way to achieve alignment inside a file?

  • What do you mean by "alignment"? To write at a specific position in the file? Have you tried seeking in the file to the wanted position? Commented Jul 21, 2017 at 9:06
  • I will edit the question Commented Jul 21, 2017 at 9:08
  • So, you want all data to be in an 8-byte (64-bit) format. Is that correct? Commented Jul 21, 2017 at 9:11
  • Just a note about the "holes" you might get by seeking forward in a file and writing with a gap: those "holes" will be included when you memory map the file, so the position of the structures will still be 8-byte aligned in the memory mapping. Also, having "holes" in the file might make it use less space on the disk. If you fill the holes with data (even zeros), that will use space on the disk. Commented Jul 21, 2017 at 9:27
  • You have to round up. So it is 8 * ((offset + 7) / 8). Commented Jul 21, 2017 at 9:32
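
The round-up formula from the last comment can be sketched as a small helper. This is only an illustration: the names alignUp and paddingFor are made up for this example and do not appear in the question. Unlike calcAlign in the question, paddingFor returns 0 for an offset that is already aligned, so the caller needs no special case.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: rounds offset up to the next multiple of align.
// An offset that is already a multiple of align is returned unchanged.
inline std::size_t alignUp(std::size_t offset, std::size_t align) {
    assert(align != 0);  // a zero alignment would divide by zero
    return align * ((offset + align - 1) / align);
}

// Hypothetical helper: number of padding bytes needed after offset.
// Returns 0 (not align) when offset is already aligned.
inline std::size_t paddingFor(std::size_t offset, std::size_t align) {
    return alignUp(offset, align) - offset;
}
```

For example, with an alignment of 8, an offset of 9 rounds up to 16, so 7 padding bytes are needed, while an offset of 16 needs none.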

2 Answers

You don't need to rely on writeToFile to return the size; you can use ofstream::tellp instead:

const auto beginPos = file_data.tellp();
// write stuff to file
const auto alignSize = (file_data.tellp()-beginPos)%8;
if(alignSize)
    file_data.write("\0\0\0\0\0\0\0\0",8-alignSize);

EDIT post OP comment: Tested on a minimal example and it works.

#include <iostream>
#include <fstream>
int main(){
    using namespace std;
    ofstream file_data;
    file_data.open("tempfile.dat", ios::out | ios::binary);
    const auto beginPos = file_data.tellp();
    file_data.write("something", 9);
    const auto alignSize = (file_data.tellp() - beginPos) % 8;
    if (alignSize)
        file_data.write("\0\0\0\0\0\0\0\0", 8 - alignSize);
    file_data.close();
    return 0;
}

2 Comments

ofstream::tellp - nice. I use alignToBeAdded for something else, but I will do it with ofstream::tellp. About the while loop: are you sure it is more efficient than calculating the padding and writing it in a single operation? (It would be a nice place to use ostream::put(char), though.)
Actually, you can get rid of the while loop altogether and write only once. That is more efficient, and it is safe because alignSize will never be >= 8. Code amended.
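
For comparison, here is a minimal sketch of a reusable padding helper based on tellp. It is a variation, not the answer's code: it pads relative to the absolute stream position rather than to a saved beginPos, which assumes the file itself starts at an aligned position (offset 0). The names padToAlignment and demoPaddedLength are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <ostream>
#include <sstream>

// Hypothetical helper: writes zero bytes to out until its absolute write
// position is a multiple of align. Returns the number of bytes written.
inline std::size_t padToAlignment(std::ostream &out, std::size_t align) {
    assert(align != 0);
    const std::streamoff pos = out.tellp();  // -1 if the stream has failed
    assert(pos >= 0);
    const std::size_t rem = static_cast<std::size_t>(pos) % align;
    if (rem == 0)
        return 0;  // already aligned, nothing to write
    const std::size_t padding = align - rem;
    for (std::size_t i = 0; i < padding; ++i)
        out.put('\0');  // buffered by the stream, so the loop is cheap
    return padding;
}

// Demo on an in-memory stream: 9 payload bytes plus padding to 8.
inline std::size_t demoPaddedLength() {
    std::ostringstream out;
    out.write("something", 9);
    padToAlignment(out, 8);
    return out.str().size();
}
```

Because the helper takes a std::ostream reference, the same function should work for a std::ofstream opened in binary mode; the in-memory stream is used here only to keep the demo free of file handling.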

You can optimize the process by handling the alignment in the input buffer instead of in the file-writing code. Modify your Data struct so that the code that fills the buffer takes care of the alignment.

struct Data{
    uint64_t size;
    char blob[MAX_SIZE];
    // ... other methods here

    // Ensure the padded data always fits inside blob
    static_assert(MAX_SIZE % 8 == 0, "blob size must be aligned to 8 bytes to avoid Buffer Overflow.");

    uint64_t Fill(const char* data, uint64_t dataLength) {
        // Validations... (e.g. dataLength <= MAX_SIZE)

        memcpy(this->blob, data, dataLength);
        this->size = dataLength;

        // calcAlign returns 8 for aligned lengths, so "% 8" maps that to 0
        const auto paddingLen = calcAlign(dataLength, 8) % 8;
        if (paddingLen > 0) {
            memset(this->blob + dataLength, 0, paddingLen);
        }

        // Return the aligned size
        return dataLength + paddingLen;
    }
};

Now, when you pass the data to the "process" function, simply use the size returned from Fill, which ensures 8-byte alignment. This way you still take care of the alignment manually, but you don't have to write to the file twice.

Note: This code assumes you also use Data as the input buffer. You should apply the same principles if your code uses another object to hold the buffer before it is written to the file.

If you can use POSIX, see also pwrite.
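
To illustrate the pwrite suggestion, here is a rough, POSIX-only sketch. pwrite takes an explicit file offset, so each record can be written directly at the next aligned offset and no padding bytes are written at all. The helper names pwriteAligned and demoAlignedWrites are hypothetical, the demo assumes a writable /tmp, and error handling is kept minimal.

```cpp
#include <cassert>
#include <cstddef>
#include <stdlib.h>     // mkstemp
#include <sys/types.h>  // off_t, ssize_t
#include <unistd.h>     // pwrite, lseek, close, unlink

// Hypothetical helper: writes len bytes at the first offset >= offset that
// is a multiple of align. Returns the offset just past the data, or -1.
inline off_t pwriteAligned(int fd, off_t offset, const void *data,
                           std::size_t len, std::size_t align) {
    assert(align != 0 && offset >= 0);
    const off_t a = static_cast<off_t>(align);
    const off_t start = (offset + a - 1) / a * a;  // round up
    const char *p = static_cast<const char *>(data);
    std::size_t done = 0;
    while (done < len) {  // pwrite may write fewer bytes than requested
        const ssize_t n = pwrite(fd, p + done, len - done,
                                 start + static_cast<off_t>(done));
        if (n <= 0)
            return -1;
        done += static_cast<std::size_t>(n);
    }
    return start + static_cast<off_t>(len);
}

// Demo: writes a 9-byte and a 1-byte record, returns the final file size.
inline off_t demoAlignedWrites() {
    char path[] = "/tmp/align_demo_XXXXXX";
    const int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    off_t next = pwriteAligned(fd, 0, "something", 9, 8);  // bytes [0, 9)
    if (next >= 0)
        next = pwriteAligned(fd, next, "x", 1, 8);         // byte 16
    const off_t size = lseek(fd, 0, SEEK_END);
    close(fd);
    unlink(path);
    return (next == 17) ? size : -1;
}
```

The skipped range between the two records is never written, so it reads back as zeros. Whether it actually saves disk space depends on the file system and on the size of the gap.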

1 Comment

I did something similar. I "pushed" the alignment considerations into the Data class.
