I have a Word document (a .docx file).
As you can see in the Word document, there are a number of questions with bullet points. Right now I am trying to extract each paragraph from the file using Apache POI. Here is my current code:
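import java.io.FileInputStream;
import java.util.List;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;

public static String readDocxFile(String fileName) {
    // try-with-resources closes the document and the stream even if reading fails
    try (FileInputStream fis = new FileInputStream(fileName);
         XWPFDocument document = new XWPFDocument(fis)) {
        List<XWPFParagraph> paragraphs = document.getParagraphs();
        StringBuilder whole = new StringBuilder();
        for (XWPFParagraph para : paragraphs) {
            // getText() returns only the paragraph's text, without any list marker
            System.out.println(para.getText());
            whole.append("\n").append(para.getText());
        }
        return whole.toString();
    } catch (Exception e) {
        e.printStackTrace();
        return "";
    }
}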
The problem with the method above is that it prints each line separately instead of whole paragraphs. The bullet points are also missing from the extracted text: the string whole is returned as plain text with no list markers.
Can anyone explain what I am doing wrong? I would also welcome suggestions for a better way to solve this.
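One possible direction, offered as a sketch rather than a definitive fix: in the .docx format every line ended with Enter is its own paragraph, and a bullet is stored as a numbering property of the paragraph rather than as a character in its text, which would explain both symptoms. The example below uses only the standard library so it runs on its own. The class BulletAwareText, the holder Para, and the method render are hypothetical names made up for illustration, and the "- " marker and two-space indent are arbitrary choices.

```java
import java.math.BigInteger;
import java.util.List;

// Hypothetical helper: re-attaches list markers that plain text extraction drops.
final class BulletAwareText {

    // Hypothetical holder for the values read from one paragraph.
    static final class Para {
        final String text;
        final BigInteger numId; // null when the paragraph is not a list item
        final BigInteger level; // nesting level of the list item, may be null

        Para(String text, BigInteger numId, BigInteger level) {
            this.text = text;
            this.numId = numId;
            this.level = level;
        }
    }

    // Renders list items with a "- " marker indented by nesting level, and
    // puts a blank line before each non-list paragraph so that a question
    // and the bullets that follow it stay together as one block.
    static String render(List<Para> paras) {
        StringBuilder out = new StringBuilder();
        for (Para p : paras) {
            if (p.numId != null) {
                int depth = (p.level == null) ? 0 : p.level.intValue();
                for (int i = 0; i < depth; i++) {
                    out.append("  ");
                }
                out.append("- ");
            } else if (out.length() > 0) {
                out.append('\n');
            }
            out.append(p.text).append('\n');
        }
        return out.toString();
    }
}
```

With Apache POI, the list could be filled inside the existing loop with new BulletAwareText.Para(para.getText(), para.getNumID(), para.getNumIlvl()). To my knowledge XWPFParagraph.getNumID() returns null for paragraphs that are not part of a list, which is the assumption the sketch relies on to tell bullets from ordinary text.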