I am building a website where authors can create EPUB files. Authors will upload their books in .doc format, and I need to generate an EPUB file from each upload. A single .doc file will contain multiple chapters, so I need to parse the file and split it into chapters. Authors will use the Heading 1 style for their chapter titles.

Is there any way in PHP to convert a .doc file to HTML and split it into chapters at each Heading 1, so that I can build the EPUB file from the result?

After some research I found a Linux application, but I think it converts .doc to plain text only. That would lose the heading styles, so I would not be able to split the chapters.

Please suggest a solution if you have one. Thanks in advance.

1 Answer

You can achieve this using the PHPDOCX API.

First, generate XHTML from your Word document (see the function reference). Something like this:

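// Load the PHPDOCX transformation class (adjust the path to your install)
require_once '../../classes/TransformDoc.inc';

$document = new TransformDoc();
// Source Word document to convert
$document->setStrFile('../files/Text.docx');
// Convert the document to XHTML, then validate the generated markup
$document->generateXHTML();
$document->validatorXHTML();
// Output the resulting XHTML string
echo $document->getStrXHTML();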

Once you have the XHTML content, you can process it further, for example by splitting it into chapters at each Heading 1 or removing chapters.
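
For the splitting step, here is a minimal sketch that uses PHP's built-in DOM extension rather than anything from PHPDOCX. It assumes the generated XHTML renders each Heading 1 as an `<h1>` element and that a chapter's content follows its `<h1>` as sibling nodes; check the markup you actually get, since I have not verified this against PHPDOCX output. The function name `splitIntoChapters` and the `title`/`html` array keys are made up for illustration.

```php
<?php
// Hypothetical helper: split an (X)HTML string into chapters at each <h1>.
// Assumes a chapter's content follows its <h1> as sibling nodes.
function splitIntoChapters(string $xhtml): array
{
    $dom = new DOMDocument();

    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    // The leading XML prolog is a common trick to make libxml read the input as UTF-8.
    $dom->loadHTML('<?xml encoding="UTF-8">' . $xhtml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $chapters = [];

    foreach ($xpath->query('//h1') as $h1) {
        // Start the chapter with the heading itself.
        $html = $dom->saveHTML($h1);

        // Append following siblings until the next <h1> starts a new chapter.
        for ($node = $h1->nextSibling; $node !== null; $node = $node->nextSibling) {
            if ($node->nodeType === XML_ELEMENT_NODE
                && strtolower($node->nodeName) === 'h1') {
                break;
            }
            $html .= $dom->saveHTML($node);
        }

        $chapters[] = [
            'title' => trim($h1->textContent),
            'html'  => $html,
        ];
    }

    return $chapters;
}

// Usage with a small inline sample instead of real converter output.
$sample = '<h1>One</h1><p>a</p><h1>Two</h1><p>b</p>';
foreach (splitIntoChapters($sample) as $i => $chapter) {
    echo ($i + 1) . ': ' . $chapter['title'] . PHP_EOL;
}
```

Each returned entry could then be written out as its own XHTML file inside the EPUB. Anything that appears before the first `<h1>` (a title page, for instance) is ignored by this sketch, so handle it separately if you need it.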

Complete documentation can be found here.

1 Comment

Just want to add that this tool is not free. Moreover, the PHPDOCX conversion functions are only available from the Pro+ license (~$149).
