How to read Delphi record structure in Java

Question

I have a binary file which consists of Delphi records. The record looks like:

TRMapFileHeader = record
    FileType: String[8];
    Points: Int64;
    Objects: Int64;
    Text: Int64;
    ObjLayers: byte;
    TextLayers: byte;
  end;

I want to read this file in Java. I opened the file:

DataInputStream file = new DataInputStream(new FileInputStream(filename));

and then I've tried to read data:

for(int i = 0; i<8; i++)
    System.out.print((char)file.readByte());
System.out.println();
System.out.println(file.readLong());
System.out.println(file.readLong());
System.out.println(file.readLong());
System.out.println(file.readByte());
System.out.println(file.readByte());

and I've got

[screenshot: Eclipse output]

instead of correct data which are:

I played with different ways of reading and found the following:

System.out.println(file.readByte());
for(int i = 0; i<3; i++)
    System.out.print((char)file.readByte());

for(int i = 0; i<36; i++)
    file.readByte();

System.out.println();
System.out.println(file.readByte());
System.out.println(file.readByte());

gives the following output: [screenshot: Eclipse output]. The first byte equals 3, then come 3 characters, then 36 bytes, and then the last 2 fields of the record.

So I'm wondering how to read this kind of record.

Consider using packed records in Delphi so you don't have to deal with alignment.
– Marcus Adams, Commented Oct 27, 2013 at 21:30
Why would you use packed records? That will cause breakage elsewhere if you reuse the record somewhere else.
– Johan, Commented Oct 28, 2013 at 0:32
@MarcusAdams Hmm, not sure about that. Using records to binary blit data is so 1970s! BinaryWriter/BinaryReader, for example, would make more sense these days. Packing records just makes the performance suck.
– David Heffernan, Commented Oct 28, 2013 at 13:03
I wonder why not just take ANY hex editor/viewer out there and parse the file by trial and error, then recreate the parsing in Java.
– Arioch 'The, Commented Oct 29, 2013 at 11:41
@Arioch'The Well, I guess trial and error is what you might resort to if you could not work it out from first principles. But how would you know for sure that you had got it right? If you tossed a coin and got H,T,H,T,H,T you might conclude that coin tossing results in an alternating sequence.
– David Heffernan, Commented Oct 29, 2013 at 17:01

David Heffernan · Accepted Answer · 2013-10-27 16:04:19Z

The Delphi type String[8] is a short string. Its implementation contains an extra lead byte containing the length of the string. So, the size of String[8] is 9 bytes.

You'll need to read the first byte to find the length, and then the next 8 bytes for the payload. Remember that the first byte tells you how many of the subsequent 8 bytes carry meaning.
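As a minimal sketch of that decoding step: the class and method names below are made up for illustration, and the ISO-8859-1 charset is an assumption, since the question does not say which ANSI code page the Delphi program used.

```java
import java.nio.charset.StandardCharsets;

class ShortStringSketch {
    // Decodes a Delphi String[maxLen] stored at `offset` in `buf`:
    // one length byte followed by maxLen payload bytes.
    static String decode(byte[] buf, int offset, int maxLen) {
        int len = buf[offset] & 0xFF; // the length byte is unsigned
        if (len > maxLen) {
            throw new IllegalArgumentException(
                "length byte " + len + " exceeds capacity " + maxLen);
        }
        // Only the first `len` payload bytes carry meaning; the rest is filler.
        // Assumption: single-byte characters that map directly to ISO-8859-1.
        return new String(buf, offset + 1, len, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Hypothetical bytes: length 3, then "ABC" followed by five filler bytes.
        byte[] raw = {3, 'A', 'B', 'C', 0, 0, 0, 0, 0};
        System.out.println(decode(raw, 0, 8));
    }
}
```

Whatever the length byte says, the field still occupies all 9 bytes in the file, so the reader has to advance past the unused payload bytes as well.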

The other thing to watch out for is alignment. As described in the question, the record would appear to be aligned. Whether or not it is depends upon the Delphi compiler settings. It's possible that the Delphi compiler was instructed to pack the records.

Let's assume not. In other words, let us assume that the record is aligned. In order for the fields to be aligned correctly, the Int64 fields will be aligned on 8 byte boundaries. Which means that the layout of the record will look like this:

Offset  Length  Field
 0      9       FileType, 1 byte length, 8 bytes payload
 9      7       <padding>
16      8       Points
24      8       Objects
32      8       Text
40      1       ObjLayers
41      1       TextLayers
42      6       <padding>

The total length of the record is 48 due to the padding at the end of the record. This will be important because if you don't skip over the padding at the end of the record, you'll be at the wrong place to read whatever comes next in the file.
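The offsets in the table can be reproduced with a few lines of arithmetic. This is only a sketch of the usual natural-alignment rule, assuming the short string and the byte fields have alignment 1, the Int64 fields have alignment 8, and the record size is rounded up to the largest field alignment. The helper names are hypothetical.

```java
import java.util.Arrays;

class RecordLayoutSketch {
    // Rounds `offset` up to the next multiple of `alignment`.
    static int alignUp(int offset, int alignment) {
        return (offset + alignment - 1) / alignment * alignment;
    }

    // sizes[i] and aligns[i] describe field i. Returns the offset of each field,
    // with the padded total record size as the last element.
    static int[] layout(int[] sizes, int[] aligns) {
        int[] result = new int[sizes.length + 1];
        int offset = 0;
        int maxAlign = 1;
        for (int i = 0; i < sizes.length; i++) {
            offset = alignUp(offset, aligns[i]); // padding before the field
            result[i] = offset;
            offset += sizes[i];
            maxAlign = Math.max(maxAlign, aligns[i]);
        }
        result[sizes.length] = alignUp(offset, maxAlign); // padding at the end
        return result;
    }

    public static void main(String[] args) {
        // FileType, Points, Objects, Text, ObjLayers, TextLayers
        int[] sizes  = {9, 8, 8, 8, 1, 1};
        int[] aligns = {1, 8, 8, 8, 1, 1};
        // prints [0, 16, 24, 32, 40, 41, 48]
        System.out.println(Arrays.toString(layout(sizes, aligns)));
    }
}
```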

A cursory examination of your output would indicate that the record is indeed aligned rather than packed. Your second block of code reads 40 bytes, and then the next two bytes (at offsets 40 and 41) are 11 and 4, which matches my table above.

One final point to note is that it is likely that the Delphi program that generated these files writes little-endian integers. Java's DataInputStream reads multi-byte values in big-endian order, so you'll need to perform a little-endian to big-endian conversion on the integer fields, for example by using java.nio.ByteBuffer.
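Putting the layout and the byte order together, one possible way to read the record is to pull in all 48 bytes and pick the fields out of a little-endian ByteBuffer at the offsets from the table. This is a sketch under the assumptions above (aligned record, little-endian integers, single-byte characters). The class and method names are invented for illustration.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

class MapFileHeaderSketch {
    static final int RECORD_SIZE = 48; // includes the 7 + 6 padding bytes

    String fileType;
    long points, objects, text;
    int objLayers, textLayers;

    // Decodes one record from a 48-byte array laid out as in the table.
    static MapFileHeaderSketch parse(byte[] raw) {
        if (raw.length < RECORD_SIZE) {
            throw new IllegalArgumentException("need " + RECORD_SIZE + " bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        MapFileHeaderSketch h = new MapFileHeaderSketch();
        int len = Math.min(raw[0] & 0xFF, 8);  // short string length byte
        // Assumption: the payload is single-byte text (ISO-8859-1 here).
        h.fileType = new String(raw, 1, len, StandardCharsets.ISO_8859_1);
        h.points = buf.getLong(16);            // absolute reads at table offsets
        h.objects = buf.getLong(24);
        h.text = buf.getLong(32);
        h.objLayers = raw[40] & 0xFF;          // Delphi byte is unsigned
        h.textLayers = raw[41] & 0xFF;
        return h;
    }

    // Reads exactly one record, trailing padding included, from the stream.
    static MapFileHeaderSketch readFrom(InputStream in) throws IOException {
        byte[] raw = new byte[RECORD_SIZE];
        new DataInputStream(in).readFully(raw);
        return parse(raw);
    }
}
```

Calling `MapFileHeaderSketch.readFrom(new FileInputStream(filename))` would then leave the stream positioned right after the trailing padding, ready for whatever follows the header.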

Let's check out this hypothesis. You state that the three longs that you read have these values:

6538107356104884224
5276531012929585152
7653586091739447296

And converted to hex we have:

5ABC060000000000
493A010000000000
6A37000000000000

Let's reverse the bytes (skipping the leading zero bytes):

6BC5A
13A49
376A

which in decimal are

441434
80457
14186

And those are your desired values. Phew, we got there in the end!
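The same check can be done in Java itself with Long.reverseBytes, which swaps the byte order of a long. The sketch below feeds it the three values that readLong() returned in the question. The class and method names are illustrative only.

```java
class EndianCheckSketch {
    // Converts a value that was read big-endian but stored little-endian.
    static long fromLittleEndian(long asReadBigEndian) {
        return Long.reverseBytes(asReadBigEndian);
    }

    public static void main(String[] args) {
        long[] asRead = {
            6538107356104884224L,
            5276531012929585152L,
            7653586091739447296L
        };
        for (long v : asRead) {
            // Shows the hex form as read, then the byte-swapped decimal value.
            System.out.println(Long.toHexString(v) + " -> " + fromLittleEndian(v));
        }
    }
}
```

This is handy as a quick fix when the existing code already calls readLong(); wrapping the bytes in a little-endian ByteBuffer avoids the extra swap.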

edited Oct 27, 2013 at 16:04

answered Oct 27, 2013 at 15:19

5 Comments

Rayz Over a year ago

System.out.println(file.readByte());
for(int i = 0; i<8; i++)
    System.out.print((char)file.readByte());
for(int i = 0; i<7; i++)
    file.readByte();
System.out.println();
System.out.println(file.readLong());
System.out.println(file.readLong());
System.out.println(file.readLong());
System.out.println(file.readByte());
System.out.println(file.readByte());

Didn't help. The last two parameters are fine but 'Points', 'Objects' and 'Text' are 6538107356104884224, 5276531012929585152, 7653586091739447296

Rayz Over a year ago

I read 1 byte, then 8, then 7(padding) then 3 times LongInt(3 x 8 bytes) and last 2 bytes (which are correct '11' and '4') but those 3 LongInt values are wrong.

David Heffernan Over a year ago

OK, answer updated. Only plausible explanation is endianness. I guess Java is big endian.

Rayz Over a year ago

Can you please explain about padding? Why 7 and 6 and why are they where they are? Or can you share the link to the literature?

David Heffernan Over a year ago

In simple terms, padding is added so that fields start at an offset that is an exact multiple of the field's type's alignment. An Int64 has alignment 8 and so needs to be placed at offset 0 or 8 or 16 and so on. Read the Wikipedia topic on padding and alignment.
