
The book I'm reading started talking about binary files, and how we can write to a binary file much as we write to a text file. I started reading more and wanted to give it a try; however, I've run into what seems like a simple issue, one I don't understand properly given how little I know about binary files.

So let's say I created a structure and a function, like the following:

#include <cstdlib>   // system
#include <fstream>   // fstream
using namespace std;

struct celebrities
{
    char name[15];
    char lastName[15];
};
void BinaryCreation(celebrities);
int main()
{
    celebrities actors = { "Denzel", "Washington" };

    BinaryCreation(actors);

    system("pause");
}

Now, I'll create a binary file:

void BinaryCreation(celebrities actors)
{

        fstream file;
        file.open("binaryfile.txt", ios::binary | ios::out);

The book states that I should write something like the following to output the data in binary:

file.write(address, size)

This is where it gets confusing: if I have a structure, how exactly do I do that? I tried the following:

file.write(&actors.name, sizeof(actors.name));
file.write(&name, sizeof(name));

I also tried reinterpret_cast, and I tried the following:

file.write(actors.name, sizeof(actors.name));

which worked in the sense of no errors, but it would output to file in text form (ASCII).

I'm sure this is very simple, and I'm overlooking something, but at the moment I can't figure it out.

  • Changing the output type to binary does not magically change your text into something else. It's still text, and will still be written as the same values. The difference is that binary output can contain characters that aren't text (like NUL, or 0x01 or 0x02). Commented Apr 23, 2017 at 3:23
  • @KenWhite I see, that's what threw me off, I was expecting a bunch of symbols, and instead got the same text. Thanks for the input. Commented Apr 23, 2017 at 3:35

1 Answer

The correct way to write the raw memory contents of the object would be:

file.write(reinterpret_cast<char *>(&actors), sizeof(actors));

"but it would output to file in text form (ASCII)."

Well, your structure contains only text, so text is what you will see when you open it.

Also, I'm guessing that you did not open the file in a hex editor. If you did, you would see that each field occupies 15 bytes, regardless of whether the text in each character array needs less space. The unused bytes after each string's terminator are null bytes, which may not be shown as printable characters by the program you used to view the file.

For example, given this program:

#include <iostream>

struct celebrities
{
    char name[15];
    char lastName[15];
};

int main() {
    celebrities actors = { "Denzel", "Washington" };

    std::cout.write(reinterpret_cast<char *>(&actors), sizeof(actors));

    return 0;
}

Compiling this and piping the program's output to xxd gives the following:

0000000: 4465 6e7a 656c 0000 0000 0000 0000 0057  Denzel.........W
0000010: 6173 6869 6e67 746f 6e00 0000 0000       ashington.....

Each field occupies exactly 15 bytes. The unused space after each string's null terminator is filled with additional null characters (byte 0), because the aggregate initializer zero-fills the rest of each array. If you had previously stored a longer string in one of the object's fields, you might see remnants of it in the output file.

If we #include <cstring> and add this line directly above the call to std::cout.write() in the above program:

std::strcpy(actors.lastName, "Whitaker");

Running the program now produces this content:

0000000: 4465 6e7a 656c 0000 0000 0000 0000 0057  Denzel.........W
0000010: 6869 7461 6b65 7200 6e00 0000 0000       hitaker.n.....

Note the lone n, left over from the end of the previous value, "Washington".


7 Comments

Yeah that second code worked. I just always thought that binary was practically symbols and numbers, basically the way your output came out. I'll download a hex editor to check it out.
@ReMaKe Technically binary files (all files, in fact) have no inherent meaning; there needs to be an interpretation of their contents for them to be useful. Data stored in memory as ASCII gets written out in ASCII (unless it is first encoded as something else) so what you get in your output file is whatever is stored in each array -- which is ASCII. So the file looks like text because, well... that's what is stored in it! Note that "my output" is the program's output piped through a hex-dump tool which displays the hex code for each byte.
I see. I'm sure the book will explain it in the following pages or chapters, but is there any immediate reason I would want to output to binary instead of a regular output file?
@ReMaKe Well, for that comparison you have to define "regular." Basically every kind of data file is going to have some kind of structure to it. The thing about serializing POD (plain-old-data) types by writing them out directly is that you don't have to do any work to define how the data is structured in the file; it's just a dump directly from memory. In essence, the compiler defines the layout. The problem with this approach is that it necessarily depends on the compiler and possibly also some things about the machine running the code.
The only real guarantee with the "dump memory" approach is that you can read the file back on the same computer with a program having an identical structure and compiled with the same compiler version. Beyond that, all bets are pretty much off. So it's good for toy programs or experiments. I would hesitate to use it anywhere you actually care about your data.