HTTP Java file download problem

Question

I am trying to download a file with the Apache HttpClient library, but the resulting file is smaller than the original (approximately 32 KB, when the normal file size is 92-93 KB) and cannot be opened in a PDF viewer. Can someone explain why this might be happening? (Downloading the same file with Firefox sometimes retrieves it fully and sometimes only partly.)

Here is the code I was using to download the file via URL:

    URL url = new URL("pathtofile");
    final URLConnection connection = url.openConnection();

    final InputStream is = connection.getInputStream();
    FileOutputStream fos = new FileOutputStream("C://result1.pdf");

    byte buffer[] = new byte[1024];
    int bytesRead; 
    while ((bytesRead = is.read(buffer)) >= 0) {        
        fos.write(buffer, 0, bytesRead);
    }
    fos.flush();
    fos.close();
    is.close();

P.S. I also tried downloading this file with the Apache HttpClient library, with the same result.

UPDATED: Monitoring the traffic with a network tool, I found a difference between receiving the file via Firefox and via the application.

With Firefox, the first HTTPPayloadLine was:

HTTPPayloadLine: 83 Td /F2 5.80476 Tf (A:\040Asinis\04017.12.10\04008:32\040laboratorij) Tj 100 Tz 1 1 1 rg /F1 5.80476 Tf 0 0 0 rg 104.4856 0 Td <0145> Tj 1 1 1 rg 0 0 0 rg 3.62799 0.72565 Td /F2 5.80476 Tf (\040) Tj 1 1 1 rg 0.83137 0.81569 0.78431 RG ET 51

With the application, the first HTTPPayloadLine was:

HTTPPayloadLine: CWgC,ú&ÿ3@Î"Ý¯V¨®~×>×)\ªleÚlµï½ci ¤Ãðð'È/CÈAø¯ª ÍübA«1Ãÿ Åç«VÉ¬ZòYóóy7»ÇH.o²e<qZna3l±°¥þ6ñþ[2YÚ1ì³Eë-ÓÊÏ$y:tÎà![ËÅS¤¿É¡¢è,þ|ºs¨)@¢Qâ¯ÝF~}oµÒ>¦ OAxz³äÒ.ß9 æÃZ¤ùÒ¨*«øUÎ®+4×

These measurements were taken with Microsoft Network Monitor.

LAST UPDATE: It was a server problem after all. After they fixed it, the files download successfully.

If you read the URL connection's input stream, you will read "everything" that comes from the server. If "pathtofile" is an HTTP request, the server's output will include some header information that cannot be processed by the PDF viewer. What are the contents of the file downloaded so far? Does it start with "%PDF"?
– PeterMmm, Commented Jun 2, 2011 at 6:32
Curious: is there a Content-Encoding header on the HTTP response?
– EricLaw, Commented Jun 2, 2011 at 13:37
Regarding Firefox sometimes downloading fine and sometimes not: I would investigate the server further, if you have access to the details.
– PeterMmm, Commented Jun 2, 2011 at 14:14
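
The checks raised in these comments (does the body start with "%PDF", is there a Content-Encoding header, did everything arrive) can be made from code. The following is only a minimal sketch under assumptions: the class name `DownloadCheck` and its `Report` type are hypothetical and not part of the asker's code. It reads the whole body, compares the received byte count with the declared Content-Length, and looks for the PDF magic bytes. Note that `URLConnection.getInputStream()` yields only the response body, so HTTP header lines should not normally end up in the saved file.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the original question.
final class DownloadCheck {

    /** What the server announced versus what actually arrived. */
    static final class Report {
        final String contentEncoding;      // null if the header is absent
        final long declaredLength;         // -1 if no Content-Length was sent
        final long receivedLength;         // bytes actually read from the body
        final boolean startsWithPdfMagic;  // true if the body begins with "%PDF"

        Report(String contentEncoding, long declaredLength,
               long receivedLength, boolean startsWithPdfMagic) {
            this.contentEncoding = contentEncoding;
            this.declaredLength = declaredLength;
            this.receivedLength = receivedLength;
            this.startsWithPdfMagic = startsWithPdfMagic;
        }

        /** True when no length was declared or the declared length arrived. */
        boolean isComplete() {
            return declaredLength < 0 || declaredLength == receivedLength;
        }
    }

    static Report inspect(URLConnection connection) throws IOException {
        // Asking for a header field connects implicitly if necessary.
        String encoding = connection.getContentEncoding();
        long declared = connection.getContentLengthLong();

        ByteArrayOutputStream body = new ByteArrayOutputStream();
        try (InputStream in = connection.getInputStream()) {
            byte[] buffer = new byte[8192];
            int bytesRead;
            while ((bytesRead = in.read(buffer)) != -1) {
                body.write(buffer, 0, bytesRead); // only the bytes just read
            }
        }
        byte[] bytes = body.toByteArray();
        return new Report(encoding, declared, bytes.length, startsWith(bytes, "%PDF"));
    }

    private static boolean startsWith(byte[] data, String prefix) {
        byte[] expected = prefix.getBytes(StandardCharsets.US_ASCII);
        if (data.length < expected.length) {
            return false;
        }
        for (int i = 0; i < expected.length; i++) {
            if (data[i] != expected[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A Content-Encoding such as gzip would mean the bytes on the wire are compressed and need decoding before they are saved. A received length below the declared length (or an IOException while reading) would suggest the transfer was cut short, which would be consistent with a server-side problem.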

Pintac · Accepted Answer · 2011-06-01 09:29:33Z

3

Try changing the loop to:

    while ((bytesRead = in.read(buffer)) != -1) {
        // Copy only the bytes filled by this read (ArrayUtils is from Apache Commons Lang)
        byte[] tmp = ArrayUtils.subarray(buffer, 0, bytesRead);
        fos.write(tmp);
    }

You might get 0 bytes back, but that does not mean the stream is finished. Also, write only the bytes that you received, not the whole buffer.

answered Jun 1, 2011 at 9:29

2 Comments

Costi Ciudatu Over a year ago

The java.io.OutputStream already provides a write(byte b[], int off, int len) method.

Pintac Over a year ago

Yep, I know; I just tried to make it clearer what I am doing. fos.write(buffer, 0, bytesRead)

Costi Ciudatu · Accepted Answer · 2011-06-01 09:34:29Z

0

The first thing that I spotted is that you check whether is.read(buffer) > 0, which is wrong, as it may (in theory, at least) return 0 even if it's not reached the end of file. InputStream.read() will return -1 when EOF is reached, so make that comparison >= 0.

EDIT: The second thing that I spotted (a little late, as it's already been noticed in other answers) is that you're writing the whole buffer to the output stream no matter how much of it was actually affected by the latest read operation. Try something like:

    byte[] buffer = new byte[BUFFER_SIZE];
    int bytesRead;
    while ((bytesRead = in.read(buffer)) >= 0) {
        out.write(buffer, 0, bytesRead);
    }
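
To see the effect of the two variants, here is a small self-contained sketch. The class `StreamCopy` and both method names are made up for illustration, and an in-memory stream stands in for the HTTP connection. It copies the same input once with the offset/length form of `write` and once writing the whole buffer on every iteration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical demo class; the names are not from the original answers.
final class StreamCopy {
    private static final int BUFFER_SIZE = 1024;

    /** Correct loop: writes exactly the bytes returned by each read. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        int bytesRead;
        while ((bytesRead = in.read(buffer)) != -1) {
            out.write(buffer, 0, bytesRead);
            total += bytesRead;
        }
        return total;
    }

    /** Buggy loop for comparison: writes the full buffer after a short read. */
    static long copyWholeBuffer(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        while (in.read(buffer) != -1) {
            out.write(buffer); // also writes stale bytes left from earlier reads
            total += buffer.length;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] source = new byte[2500]; // deliberately not a multiple of BUFFER_SIZE
        long good = copy(new ByteArrayInputStream(source), new ByteArrayOutputStream());
        long bad = copyWholeBuffer(new ByteArrayInputStream(source), new ByteArrayOutputStream());
        System.out.println("correct copy:      " + good + " bytes");
        System.out.println("whole-buffer copy: " + bad + " bytes");
    }
}
```

Writing the whole buffer pads the output with stale bytes, so it corrupts the file by making it larger. It would not by itself explain a result that is smaller than the original, as described in the question.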

edited Jun 1, 2011 at 9:34

answered Jun 1, 2011 at 9:29

2 Comments

Costi Ciudatu Over a year ago

@artjomka: Have you also modified your code as in my latest edit or in Pintac's answer?

artjomka Over a year ago

I modified it as in your latest edit (and updated the main post with the changes).

Dikla · Accepted Answer · 2011-06-01 09:38:41Z

0

Maybe reading the error stream can give you some information:

    connection.getErrorStream();

answered Jun 1, 2011 at 9:38

3 Comments

artjomka Over a year ago

I can't find a getErrorStream method on the URLConnection class.

Dikla Over a year ago

Right, it's a method of HttpURLConnection. If your connection is HTTP, do: ((HttpURLConnection)connection).getErrorStream();

artjomka Over a year ago

This method returns null in my case
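
As a rough sketch of the suggestion in this thread, the status code and error stream can be read together after casting to `HttpURLConnection`. The class `HttpStatusCheck` and the method `describe` are hypothetical names chosen for this example. A null result from `getErrorStream()` is expected when the server answered with a success status or sent no error body, so null alone does not rule out a server-side problem.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; assumes the URL uses HTTP or HTTPS.
final class HttpStatusCheck {

    /** Returns the status code followed by the error body, if there is one. */
    static String describe(URL url) throws IOException {
        HttpURLConnection http = (HttpURLConnection) url.openConnection();
        try {
            int status = http.getResponseCode();      // sends the request
            InputStream error = http.getErrorStream(); // null when there was no error
            if (error == null) {
                return status + " (no error body)";
            }
            try (InputStream in = error) {
                return status + " " + new String(readAll(in), StandardCharsets.UTF_8);
            }
        } finally {
            http.disconnect();
        }
    }

    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int bytesRead;
        while ((bytesRead = in.read(buffer)) != -1) {
            out.write(buffer, 0, bytesRead);
        }
        return out.toByteArray();
    }
}
```

Calling `HttpStatusCheck.describe(url)` with the download URL would show whether the server is reporting an error status for some of the requests.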

artbristol · Accepted Answer · 2011-06-02 13:44:35Z

0

Can you use org.apache.commons.io.FileUtils.copyURLToFile(URL, File) instead?
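
If adding Commons IO is not an option, a similar one-call copy can be sketched with the JDK alone. This is a swapped-in technique, not the Commons IO implementation: it uses `java.nio.file.Files.copy`, and the class `UrlToFile` and method `download` are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical JDK-only analogue; not the Commons IO implementation.
final class UrlToFile {

    /** Copies everything readable from the URL into target; returns the byte count. */
    static long download(URL source, Path target) throws IOException {
        try (InputStream in = source.openStream()) {
            // Files.copy runs the read/write loop internally and overwrites target.
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

Either way, a library copy only removes the hand-written loop as a source of bugs. If the server sends a truncated response, the saved file will still be truncated.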

answered Jun 2, 2011 at 13:44
