I am using PDFBox 2.0. While parsing a PDF document, I also want to render the first page as an image and store it in HBase so that I can show it in search results (I am going to build a search results page similar to the one on amazon.com).

HBase stores values as byte[]. I need to convert the image to a byte[] and then store it in HBase. I have implemented the image rendering, but how can I convert the result to a byte[]?

        PDDocument document = PDDocument.load(file, "");
        BufferedImage image = null;
        try {
            PDFRenderer pdfRenderer = new PDFRenderer(document);
            if (document.isEncrypted()) {
                try {
                    System.out.println("Trying to decrypt...");
                    document.setAllSecurityToBeRemoved(true);
                    System.out.println("The file has been decrypted.");
                }
                catch (Exception e) {
                    throw new Exception("cannot be decrypted. ", e);
                }
            }
            // 0 means first page; render it once at 300 DPI.
            image = pdfRenderer.renderImageWithDPI(0, 300, ImageType.RGB);
            document.close();
        } catch (Exception e) {
            e.printStackTrace();
        }

If I add ImageIOUtil.writeImage(image, fileName + ".jpg", 300); right above document.close();, the program creates a JPG file in the project path. I need to put the image into a byte[] instead of creating a file. Is that possible?

1 Answer

This can be done with ImageIO.write(RenderedImage, String, OutputStream), which can write to an arbitrary OutputStream rather than to disk. A ByteArrayOutputStream collects the output bytes in memory, and toByteArray() returns them as a byte[].

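import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import javax.imageio.ImageIO;
...
// example image (RGB: the standard JPEG writer in recent JDKs rejects images with an alpha channel)
BufferedImage image = new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB);

// to array
ByteArrayOutputStream bos = new ByteArrayOutputStream();
ImageIO.write(image, "jpg", bos);
byte[] output = bos.toByteArray();
System.out.println(Arrays.toString(output));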
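
For use with the rendered page from the question, the same idea can be wrapped in a small helper. This is only a sketch: the class name ImageBytes and the method name toJpegBytes are made up for illustration, and the blank 4x3 image stands in for the BufferedImage returned by PDFRenderer. The one addition is checking the boolean returned by ImageIO.write, which is false when no writer can handle the image, so an empty array is not stored silently.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

class ImageBytes {
    // Hypothetical helper: encodes an image as JPEG and returns the encoded bytes.
    static byte[] toJpegBytes(BufferedImage image) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        // ImageIO.write returns false when no appropriate writer is found
        if (!ImageIO.write(image, "jpg", bos)) {
            throw new IOException("No JPEG writer available for this image type");
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the page image rendered by PDFRenderer
        BufferedImage image = new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB);
        byte[] bytes = toJpegBytes(image);
        System.out.println(bytes.length + " bytes");
    }
}
```

The resulting byte[] is what would be passed to HBase as the cell value.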

3 Comments

What library does ByteOutputStream come from? Is it com.sun.xml.internal.messaging.saaj.util.ByteOutputStream?
My bad, it should have been java.io.ByteArrayOutputStream, which is a core Java class. Updated the answer.
Thank you so much. Now I have to figure out how to read it back from HBase and show it as an image in the search list.
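
On that last point, going from bytes back to an image is the mirror operation. A minimal sketch, assuming the value read back from HBase is the same encoded image that was stored: ImageIO.read can decode it from a ByteArrayInputStream. The class and method names are illustrative, the HBase read itself is not shown, and the bytes are produced locally here just to have something to decode.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

class BytesToImage {
    // Hypothetical helper: decodes encoded image bytes into a BufferedImage.
    static BufferedImage fromBytes(byte[] bytes) throws IOException {
        BufferedImage image = ImageIO.read(new ByteArrayInputStream(bytes));
        // ImageIO.read returns null when no registered reader recognizes the data
        if (image == null) {
            throw new IOException("Unrecognized image format");
        }
        return image;
    }

    public static void main(String[] args) throws IOException {
        // Produce some JPEG bytes locally; in practice these would come from storage
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ImageIO.write(new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB), "jpg", bos);

        BufferedImage decoded = fromBytes(bos.toByteArray());
        System.out.println(decoded.getWidth() + "x" + decoded.getHeight()); // prints 4x3
    }
}
```

For a web page, decoding is often unnecessary: the stored JPEG bytes can be sent to the browser as they are, with an image/jpeg content type.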
