6

So I've been trying to make a small program that reads a file into a byte array, then converts that byte array to hex, then to binary. It will then play with the binary values (I haven't decided what to do at that stage yet) and save the result as a custom file.

I studied a lot of internet code and I can turn a file into a byte array and into hex, but the problem is I can't turn huge files into byte arrays (out of memory).

This is the code that is not a complete failure:

public void rundis(Path pp) {
    byte bb[] = null;

    try {
        bb = Files.readAllBytes(pp); //Files.toByteArray(pathhold);
        System.out.println("byte array made");
    } catch (Exception e) {
        e.printStackTrace();
    }
    if (bb != null && bb.length != 0) { // null check must come first, otherwise bb.length can throw a NullPointerException
        System.out.println("byte array filled");
        //send to method to turn into hex
    } else {
        System.out.println("byte array NOT filled");
    }

}

I know how the process should go, but I don't know how to code that properly.

The process if you are interested:

  • Input file using File
  • Read the file chunk by chunk into a byte array. Ex. each byte array chunk holds 600 bytes
  • Send that chunk to be turned into a hex value --> Integer.toHexString
  • Send that hex value chunk to be made into a binary value --> Integer.toBinaryString
  • Mess around with the Binary value
  • Save to custom file line by line
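
For the two conversion steps in the list above, here is a minimal sketch of turning a single byte into hex and binary text with Integer.toHexString and Integer.toBinaryString. The class and method names (ByteFormatDemo, toHex, toBinary) are made up for illustration. Note that the binary string can be produced directly from the byte, so hex is not required as an intermediate step.

```java
class ByteFormatDemo {
    // Two-digit uppercase hex for one byte, e.g. (byte) 0xAB -> "AB".
    static String toHex(byte value) {
        // Mask with 0xFF so negative bytes are not sign-extended to 32 bits.
        String hex = Integer.toHexString(value & 0xFF).toUpperCase();
        return hex.length() == 1 ? "0" + hex : hex;
    }

    // Eight-digit binary for one byte, e.g. (byte) 5 -> "00000101".
    static String toBinary(byte value) {
        // Adding 0x100 sets bit 8; substring(1) drops it again, which keeps the leading zeros.
        return Integer.toBinaryString((value & 0xFF) + 0x100).substring(1);
    }

    public static void main(String[] args) {
        byte[] chunk = {0x00, 0x05, (byte) 0xAB, (byte) 0xFF};
        for (byte b : chunk) {
            System.out.println(toHex(b) + " " + toBinary(b));
        }
    }
}
```

Without the 0xFF mask, a negative byte such as (byte) 0xAB would be widened to a 32-bit int first and print as FFFFFFAB.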

Problem: I don't know how to turn a huge file into a byte array chunk by chunk to be processed. Any and all help will be appreciated, thank you for reading :)

4
  • How big is the file? Commented Sep 8, 2016 at 20:15
  • somewhere around 7GB Commented Sep 8, 2016 at 20:17
  • Look at FileInputStream#read(byte[] b). Then you can specify how many bytes to read at a time. Commented Sep 8, 2016 at 20:21
  • If I'm not asking too much, can you give some examples, or even a link to an example? I read it but I'm not sure how to implement it exactly. :) Commented Sep 8, 2016 at 20:36

2 Answers

14

To chunk your input use a FileInputStream:

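    Path pp = FileSystems.getDefault().getPath("logs", "access.log");
    final int BUFFER_SIZE = 1024 * 1024; // buffer size in bytes (1 MiB)

    // try-with-resources closes the stream even if an exception is thrown
    try (FileInputStream fis = new FileInputStream(pp.toFile())) {
        byte[] buffer = new byte[BUFFER_SIZE];
        int read;
        // read() returns the number of bytes placed in buffer, or -1 at end of file
        while ((read = fis.read(buffer)) > 0) {
            // only buffer[0..read-1] is valid here; call your other methods here...
        }
    }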
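
One detail of the loop above that is easy to miss: read(buffer) does not always fill the whole buffer, so only the first `read` bytes are valid on each pass. The sketch below records what read returns on each call. It uses a ByteArrayInputStream as a stand-in for the file so that it runs without any input file; ReadReturnDemo and readSizes are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

class ReadReturnDemo {
    // Collects every value returned by read(buf), up to and including the end-of-stream marker -1.
    static List<Integer> readSizes(InputStream in, int bufferSize) throws IOException {
        byte[] buf = new byte[bufferSize];
        List<Integer> sizes = new ArrayList<>();
        int read;
        do {
            read = in.read(buf);
            sizes.add(read);
        } while (read != -1);
        return sizes;
    }

    public static void main(String[] args) throws IOException {
        // 10 bytes of input read through a 4-byte buffer.
        System.out.println(readSizes(new ByteArrayInputStream(new byte[10]), 4));
        // prints [4, 4, 2, -1]
    }
}
```

The last chunk here holds only 2 valid bytes; the rest of the buffer still contains data from the previous pass, which is why the byte count has to be passed along to whatever processes the buffer.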

1 Comment

Don't mention it. ;)
9

To stream a file, you need to step away from Files.readAllBytes(). It's a nice utility for small files, but as you noticed not so much for large files.

In pseudocode it would look something like this:

while there are more bytes available
    read some bytes
    process those bytes
    (write the result back to a file, if needed)

In Java, you can use a FileInputStream to read a file byte by byte or chunk by chunk. Let's say we want to write back our processed bytes. First we open the files:

FileInputStream is = new FileInputStream(new File("input.txt"));
FileOutputStream os = new FileOutputStream(new File("output.txt"));

We need the FileOutputStream to write back our results - we don't want to just drop our precious processed data, right? Next we need a buffer which holds a chunk of bytes:

byte[] buf = new byte[4096];

How many bytes is up to you; I kinda like chunks of 4096 bytes. Then we need to actually read some bytes:

int read = is.read(buf);

This will read up to buf.length bytes and store them in buf. It returns the number of bytes actually read, or -1 once the end of the file has been reached. Then we process the bytes:

//Assuming the processing function looks like this:
//byte[] process(byte[] data, int bytes);
byte[] ret = process(buf, read);

process() in the above example is your processing method. It takes a byte array and the number of bytes it should process, and returns the result as a byte array.

Last, we write the result back to a file:

os.write(ret);

We have to execute this in a loop until there are no bytes left in the file, so let's write a loop for it:

int read = 0;
while((read = is.read(buf)) > 0) {
    byte[] ret = process(buf, read);
    os.write(ret);
}

And finally, close the streams:

is.close();
os.close();

And that's it. We processed the file in 4096-byte chunks and wrote the result back to a file. It's up to you what to do with the result: you could also send it over TCP or even drop it if it's not needed, or read from TCP instead of a file. The basic logic is the same.

This still needs some proper error handling to deal with missing files or wrong permissions, but that's up to you to implement.
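
As one possible shape for that error handling, here is a sketch that puts the steps together with try-with-resources, so both streams are closed even when something fails halfway. The names ChunkedFileProcessor and processFile are made up for this example, and process() is only a placeholder that copies the valid bytes; swap in your own processing.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

class ChunkedFileProcessor {
    // Placeholder processing step (assumption): returns a copy of the valid bytes only.
    static byte[] process(byte[] data, int length) {
        return Arrays.copyOf(data, length);
    }

    // Reads 'input' in 4096-byte chunks, processes each chunk and writes it to 'output'.
    // Returns the number of input bytes that were read.
    static long processFile(Path input, Path output) throws IOException {
        long total = 0;
        try (InputStream is = new FileInputStream(input.toFile());
             OutputStream os = new FileOutputStream(output.toFile())) {
            byte[] buf = new byte[4096];
            int read;
            while ((read = is.read(buf)) > 0) {
                os.write(process(buf, read));
                total += read;
            }
        } // both streams are closed here, also when an exception was thrown
        return total;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: ChunkedFileProcessor <input> <output>");
            return;
        }
        try {
            long n = processFile(Paths.get(args[0]), Paths.get(args[1]));
            System.out.println("processed " + n + " bytes");
        } catch (FileNotFoundException e) {
            // Thrown when a file is missing or cannot be opened (e.g. wrong permissions).
            System.err.println("cannot open file: " + e.getMessage());
        } catch (IOException e) {
            System.err.println("I/O error while processing: " + e.getMessage());
        }
    }
}
```

Only one 4096-byte buffer is alive at a time, so memory use stays flat no matter how large the input file is.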


An example implementation for the process method:

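//returns the hex representation of the bytes, encoded as ASCII characters
public static byte[] process(byte[] bytes, int length) {
    // hex digits as ASCII bytes, so the result really is a byte[] (a char[] cannot be returned here)
    final byte[] hexchars = "0123456789ABCDEF".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
    byte[] ret = new byte[length * 2];
    for (int i = 0; i < length; ++i) {
        int b = bytes[i] & 0xFF;               // treat the byte as unsigned (0..255)
        ret[i * 2] = hexchars[b >>> 4];        // high nibble
        ret[i * 2 + 1] = hexchars[b & 0x0F];   // low nibble
    }
    return ret;
}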
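
Since the question also asks for binary, the same idea can be sketched for a binary representation. This variant is not from the original answer; BinaryProcess and processToBinary are hypothetical names. It emits eight ASCII '0'/'1' characters per input byte, most significant bit first.

```java
import java.nio.charset.StandardCharsets;

class BinaryProcess {
    // Returns the binary representation of the first 'length' bytes as ASCII '0'/'1' characters.
    static byte[] processToBinary(byte[] bytes, int length) {
        byte[] ret = new byte[length * 8];
        for (int i = 0; i < length; ++i) {
            int b = bytes[i] & 0xFF; // treat the byte as unsigned (0..255)
            for (int bit = 0; bit < 8; ++bit) {
                // Most significant bit first: shift right by 7, 6, ..., 0.
                ret[i * 8 + bit] = (byte) (((b >>> (7 - bit)) & 1) == 1 ? '1' : '0');
            }
        }
        return ret;
    }

    public static void main(String[] args) {
        byte[] input = {(byte) 0xA5, 0x01};
        byte[] bits = processToBinary(input, input.length);
        System.out.println(new String(bits, StandardCharsets.US_ASCII));
        // prints 1010010100000001
    }
}
```

Keep in mind that this output is eight times the size of the input (the hex version is twice the size), which matters for a multi-gigabyte file.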

4 Comments

Thanks for the detailed explanation man :) but can you explain a little bit about that part that says "process(buf, read)". What exactly is process?
This is meant to be your processing function which "does something" with the bytes. I've added an example implementation which returns the hex representation of the bytes.
This is stupid, so help at your own risk :) I tried assigning the values I got from the buffer array to another array, and it didn't work. Thanks for going out of your way to help, man :) Edit: don't help me with the stupid thing I mentioned earlier, I'll figure it out.
can you specify exactly how I can return byte[]
