for a server based j2ee application, I need to retrieve the number of pages from word documents.. any ideas what works?
5 Answers
If the documents are modern Word 2007 format you can use direct XML-based manipulation, through OOXML. This is by far the better long term solution, though I realize it may not be realistic for an entire organization to change overnight.
If they are older Word formats, you're probably stuck with server-side Word/Excel/Powerpoint/Outlook programmable object models, although you're not supposed to do that on the server..
Comments
Regarding Office Open XML support, the latest beta of Java-POI is supposed to support it.
Comments
Haven't used it before but you could try Apache POI. Looks like it has a WordCount function.
Comments
To read the page count of MS Office files you can use aspose libraries (aspose-words, aspose-cells, aspose-slides).
Examples:
Excel: number of pages of the printable version of the Woorkbook:
import com.aspose.cells.*;
public int getPageCount(String filePath) throws Exception {
Workbook book = new Workbook(filePath);
ImageOrPrintOptions imageOrPrintOptions = new ImageOrPrintOptions();
// Default 0 Prints all pages.
// IgnoreBlank 1 Don't print the pages which the cells are blank.
// IgnoreStyle 2 Don't print the pages which cells only contain styles.
imageOrPrintOptions.setPrintingPage(PrintingPageType.IGNORE_STYLE);
int pageCount = 0;
for (int i = 0; i < book.getWorksheets().getCount(); i++) {
Worksheet sheet = book.getWorksheets().get(i);
PageSetup pageSetup = sheet.getPageSetup();
pageSetup.setOrientation(PageOrientationType.PORTRAIT);
pageSetup.setPaperSize(PaperSizeType.PAPER_LETTER);
pageSetup.setTopMarginInch(1);
pageSetup.setBottomMarginInch(1);
pageSetup.setRightMarginInch(1);
pageSetup.setLeftMarginInch(1);
SheetRender sheetRender = new SheetRender(sheet, imageOrPrintOptions);
int sheetPageCount = sheetRender.getPageCount();
pageCount += sheetPageCount;
}
return pageCount;
}
Word: number of pages:
import com.aspose.words.Document;
public int getPageCount(String filePath) throws Exception {
Document document = new Document(filePath);
return document.getPageCount();
}
PowerPoint: number of slides:
import com.aspose.slides.*;
public int getPageCount(String filePath) throws Exception {
Presentation presentation = new Presentation(filePath);
return presentation.getSlides().toArray().length;
}