I am trying to read an old .dat file byte by byte and have run into an issue: each record is terminated by \n (newline). I'd like to read in the whole byte array, then split it on that character.

I can do this by reading the whole byte array from the file, creating a String from its contents, then calling String.split(), but I find this inefficient. I'd rather split the byte array directly if possible.

Can anyone assist?

Update: Code was requested.

public class NgcReader {

public static void main(String[] args) {

    String location;
    if (System.getProperty("os.name").contains("Windows")) {
        location = "F:\\Programming\\Projects\\readngc\\src\\main\\java\\ngcreader\\catalog.dat";
    } else {
        location = "/media/My Passport/Programming/Projects/readngc/src/main/java/ngcreader/catalog.dat";
    }

    File file = new File(location);

    InputStream is = null;
    try {
        is = new FileInputStream(file);
    } catch (FileNotFoundException e) {
        System.out.println("It didn't work!");
        System.exit(0);
    }

    byte[] fileByteArray = new byte[(int) file.length() - 1];

    try {
        is.read(fileByteArray);
        is.close();
    } catch (IOException e) {
        System.out.println("IOException!");
        System.exit(0);
    }

    // I do NOT like this. I'd rather split the byte array on the \n character
    String bigString = new String(fileByteArray);
    List<String> stringList = Arrays.asList(bigString.split("\\n"));
    for (String record : stringList) {
        System.out.print("Catalog number: " + record.substring(1, 6));
        System.out.print(" Catalog type: " + record.substring(7, 9));
        System.out.print(" Right Ascension: " + record.substring(10, 12) + "h " + record.substring(13, 17) + "min");
        System.out.print(" Declination: " + record.substring(18, 21) + " " + record.substring(22, 24));
        if (record.length() > 50) {
            System.out.print(" Magnitude: " + record.substring(47, 51));
        }

        if (record.length() > 93) {
            System.out.print(" Original Notes: " + record.substring(54,93));
        }

        if (record.length() > 150) {
            System.out.print(" Palomar Notes: " + record.substring(95,150));
        }
        if (record.length() > 151) {
            System.out.print(" Notes: " + record.substring(152));
        }
        System.out.println();
    }

}

}

Another Update: Here's a README with a description of the file I'm processing:

http://cdsarc.u-strasbg.fr/viz-bin/Cat?VII/1B

  • can you show your code so far? Commented Oct 4, 2011 at 13:33
  • It's not clear whether this is a text file, in which case you should load it as text, or a binary file, in which case you shouldn't be talking about characters. Commented Oct 4, 2011 at 13:34
  • yes, code! Also, is this .dat file a Text file or binary file? Commented Oct 4, 2011 at 13:35
  • Inefficient? How big is that file? Commented Oct 4, 2011 at 13:35
  • The file is a binary file, but the bytes translate directly to ASCII. Commented Oct 4, 2011 at 13:37

2 Answers


It sounds like this might actually just be a text file to start with, in which case:

InputStream stream = new FileInputStream(location);
try {
    BufferedReader reader = new BufferedReader(new InputStreamReader(stream,
                                                                     "ASCII"));
    String line;
    while ((line = reader.readLine()) != null) {
        // Handle the line, ideally in a separate method
    }
} finally {
    stream.close();
}

This way you never need to have more than a single line of the file in memory at a time.
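For illustration, here is a minimal self-contained sketch of the same loop, assuming Java 8 or later (for try-with-resources and `Consumer`). The class name `LineReaderSketch` and the helper `forEachRecord` are made up for this example, and the sample data in `main` stands in for the real catalog file.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

public class LineReaderSketch {

    // Hypothetical helper: passes each line of an ASCII stream to the handler,
    // one line at a time, without its line terminator.
    static void forEachRecord(InputStream in, Consumer<String> handler) throws IOException {
        // try-with-resources closes the reader (and the underlying stream) for us
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.US_ASCII))) {
            String line;
            while ((line = reader.readLine()) != null) {
                handler.accept(line);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in data; a real run would pass a FileInputStream instead.
        byte[] sample = "first record\nsecond record\n".getBytes(StandardCharsets.US_ASCII);
        forEachRecord(new ByteArrayInputStream(sample),
                record -> System.out.println("Record: " + record));
    }
}
```

The handler is where the fixed-width substring parsing from the question would go, so only one record is held at a time.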


1 Comment

Yes, that was it. The description in the text file described it byte by byte, so I was trying to process it that way. Stupid of me.

If you're set on using byte arrays...

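byte[] buff = new byte[1024]; // smaller buffer; assumes no record is longer than this

try {
    int ind = 0, from = 0, read;
    while ((read = is.read(buff, ind, buff.length - ind)) != -1) {
        for (int i = ind; i < ind + read; i++) {
            if (buff[i] == '\n') {
                // the third argument is a length, not an end index;
                // i - from leaves out the trailing '\n'
                String record = new String(buff, from, i - from);
                // handle
                from = i + 1;
            }
        }
        // move the unfinished record to the front of the buffer
        int remaining = ind + read - from;
        System.arraycopy(buff, from, buff, 0, remaining);
        ind = remaining;
        from = 0;
    }
    // any bytes left in buff[0..ind) are a final record with no trailing '\n'

} catch (IOException e) {
    System.out.println("IOException!");
    //System.exit(0);
    throw new RuntimeException(e); // cleaner way to die
} finally {
    is.close();
}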

This also avoids loading the entire file, and it puts the close inside a finally block.
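If the whole file is already in a byte array, as in the question, the array can also be split on '\n' directly without building one big String first. The following is only a sketch: the class name `ByteSplitSketch` and the helper `split` are invented for this example, and the sample bytes are placeholders rather than real catalog data.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ByteSplitSketch {

    // Hypothetical helper: splits data on a delimiter byte.
    // The delimiter itself is not included in the returned pieces.
    static List<byte[]> split(byte[] data, byte delimiter) {
        List<byte[]> parts = new ArrayList<>();
        int from = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == delimiter) {
                parts.add(Arrays.copyOfRange(data, from, i));
                from = i + 1;
            }
        }
        if (from < data.length) {
            // trailing record with no terminating delimiter
            parts.add(Arrays.copyOfRange(data, from, data.length));
        }
        return parts;
    }

    public static void main(String[] args) {
        // Placeholder data standing in for the file contents
        byte[] sample = "NGC 1\nNGC 2\nNGC 3".getBytes(StandardCharsets.US_ASCII);
        for (byte[] record : split(sample, (byte) '\n')) {
            System.out.println(new String(record, StandardCharsets.US_ASCII));
        }
    }
}
```

Each piece is a copy, so this roughly doubles the memory used; keeping only the start and end offsets of each record would avoid the copies at the cost of a less convenient API.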
