
I am writing a program to copy large files, so I want to read a specific number of bytes at a time and write them to another file. The copy should have the same number of bytes and the same contents as the original, but I am getting more bytes than I started with. What am I doing wrong here? If someone can explain why I am getting this extra text, that would be great.

test.txt

sometext sometext sometext sometext
sometext sometext sometext sometext
sometext sometext sometext sometext
sometext sometext sometext sometext

Practice.java

import java.io.FileInputStream;
import java.io.FileWriter;

public class Practice {
    public static void main(String[] args) {

        byte[] buffer = new byte[100];

        try {
            FileInputStream f = new FileInputStream("test.txt");
            FileWriter writer = new FileWriter("copy_test.txt");
            int b;
            while ((b = f.read(buffer)) != -1)
                writer.write(new String(buffer));
            writer.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

copy_test.txt

sometext sometext sometext sometext
sometext sometext sometext sometext
sometext sometext sometext sometext
sometext sometext sometext sometext
metext sometext sometext
sometext sometext sometext
  • Erm, why don't you just use Files.copy()? Commented Feb 15, 2015 at 10:41
  • Also, you use an InputStream as a source and a Writer as destination? Huh? Basically you're reading apples but writing oranges. Commented Feb 15, 2015 at 10:44
  • @fge Because it's just sample code; actually I am transferring files from a server to a client, so I can't use Files.copy(). Commented Feb 15, 2015 at 10:49
  • Depends on the server. The java.nio.file API allows FileSystems over anything. See here for instance. And please note that Files.copy() can copy from an InputStream to a Path as well. Commented Feb 15, 2015 at 10:54
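
The last comment's point, that Files.copy() also accepts an InputStream as its source, can be sketched roughly as follows. This is only an illustration: the class name FilesCopyDemo and the helper save are hypothetical, and a ByteArrayInputStream stands in for whatever stream the server connection would actually provide (for example a socket's input stream).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class FilesCopyDemo {

    // Hypothetical helper: drains the stream into target, replacing any
    // existing file, and returns the number of bytes copied.
    public static long save(InputStream in, Path target) throws IOException {
        return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for data arriving from a server (assumption for this sketch).
        byte[] payload = "sometext sometext\n".getBytes(StandardCharsets.UTF_8);
        Path target = Files.createTempFile("copy_test", ".txt");

        try (InputStream in = new ByteArrayInputStream(payload)) {
            long copied = save(in, target);
            System.out.println(copied + " bytes written to " + target);
        }
    }
}
```

No conversion to text happens here, so the bytes on disk are exactly the bytes that came out of the stream.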

1 Answer


There are several problems with your code:

  • You're using the default platform encoding to convert the binary data into text (by calling new String(byte[]) instead of specifying an encoding)
  • You're using the default platform encoding to write the text out to disk (by using FileWriter)
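  • You're unconditionally converting the whole of your buffer into text, even if your read didn't fill it. That is where the extra text comes from: the last read returns fewer than 100 bytes, so the rest of the buffer still holds bytes from the previous read, and those get written out again. You can fix that by using one of the String constructor overloads taking an offset and number of bytes, and passing 0 and b for arguments, although I wouldn't convert to text at all (see the later points).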
  • You're not closing the input stream at all, and if an exception is thrown you're not closing the writer. Use a try-with-resources statement.
  • You're catching Exception - that's usually a bad idea. Catch specific exceptions if you must; personally I have very few catch blocks - generally if something goes wrong, it's appropriate for that to abort the whole of the current operation. (There are cases where you can retry etc, of course.) I understand this may just be in your sample code here, and not in your real code.
  • If you're just trying to copy a file, there's no reason to convert between binary and text at all - stick with input streams and output streams
  • If you're just trying to copy a file, you can use Files.copy to avoid having to write any code at all, assuming you're using Java 7 or later. (And if you're not, you should be!)
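
To see the partially filled buffer problem in isolation, here is a small sketch that reproduces it in memory. The names StaleBufferDemo, sampleData, copyWholeBuffer and copyBytesRead are made up for this illustration, and the sample data assumes the four lines of test.txt end in CRLF, which gives 148 bytes; with other line endings the counts would differ, but the effect is the same.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class StaleBufferDemo {

    // Four lines like test.txt, assuming CRLF line endings: 4 * 37 = 148 bytes.
    public static byte[] sampleData() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 4; i++) {
            sb.append("sometext sometext sometext sometext\r\n");
        }
        return sb.toString().getBytes(StandardCharsets.US_ASCII);
    }

    // Mimics the loop in the question: the whole buffer is written after
    // every read, no matter how many bytes the read delivered.
    public static byte[] copyWholeBuffer(byte[] source, int bufferSize) throws IOException {
        InputStream in = new ByteArrayInputStream(source);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[bufferSize];
        while (in.read(buffer) != -1) {
            out.write(buffer, 0, buffer.length);
        }
        return out.toByteArray();
    }

    // Writes only the bytes that the most recent read actually delivered.
    public static byte[] copyBytesRead(byte[] source, int bufferSize) throws IOException {
        InputStream in = new ByteArrayInputStream(source);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[bufferSize];
        int bytesRead;
        while ((bytesRead = in.read(buffer)) != -1) {
            out.write(buffer, 0, bytesRead);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] source = sampleData();
        System.out.println("source:            " + source.length + " bytes");                       // 148
        System.out.println("whole-buffer copy: " + copyWholeBuffer(source, 100).length + " bytes"); // 200
        System.out.println("bytesRead copy:    " + copyBytesRead(source, 100).length + " bytes");   // 148
    }
}
```

With a 100-byte buffer the second read delivers only 48 bytes, so the last 52 bytes of the buffer are leftovers from the first read. Under the CRLF assumption those leftovers begin with "metext sometext sometext", which matches the extra lines in copy_test.txt.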

If you just want to copy an InputStream to an OutputStream (and you haven't got a utility library available - this is part of a lot of libraries) you can just use something like:

byte[] buffer = new byte[1024];
int bytesRead;
while ((bytesRead = input.read(buffer)) != -1) {
    output.write(buffer, 0, bytesRead);
}
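
For completeness, here is one way that loop might sit inside a small program, with try-with-resources closing both streams even if an exception is thrown. The class and method names (StreamCopy, copy, copyFile) are hypothetical, the 8192-byte buffer is just one reasonable choice, and the file names are the ones from the question.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class StreamCopy {

    // Copies everything from input to output and returns the byte count.
    // The caller remains responsible for closing both streams.
    public static long copy(InputStream input, OutputStream output) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int bytesRead;
        while ((bytesRead = input.read(buffer)) != -1) {
            // Write only the part of the buffer that this read filled.
            output.write(buffer, 0, bytesRead);
            total += bytesRead;
        }
        return total;
    }

    // Opens both files as binary streams; try-with-resources closes them
    // whether the copy succeeds or throws.
    public static long copyFile(Path source, Path target) throws IOException {
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = Files.newOutputStream(target)) {
            return copy(in, out);
        }
    }

    public static void main(String[] args) throws IOException {
        long copied = copyFile(Paths.get("test.txt"), Paths.get("copy_test.txt"));
        System.out.println("Copied " + copied + " bytes");
    }
}
```

Because copy takes plain InputStream and OutputStream arguments, the same method would also work with streams obtained from a network connection rather than from files.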

Comments

Why a buffer of size 1024?
@user3834119: Why not? It feels like a reasonable balance between memory usage and excessive IO, although there may well be buffering in the stream too. It's a somewhat arbitrary number - as was your 100, of course.
@JonSkeet the Files library uses 8 KB, so maybe 1 KB is a little small. Although with a smallish file it would most likely make absolutely no difference.
1 KiB is too small, really. The page granularity on most modern architectures today is 4 KiB.
@fge - the page granularity is not directly relevant. We are talking about files here. The problem with doing reads / writes using buffers that are too small is the overhead of the syscalls and context switches. But yes, 1 KB is probably too small.