I want to get the content (text only) in a ppt file. How to do it?
(It likes that if I want to get content in a txt file, I just need to open and read. What do I need to do to get information from ppt files?)
By the way, I know there is a win32com in windows system. But now I am working on linux, is there any possible way?
1 Answer
I found this discussion over on Superuser:
Command line tool in Linux to Extract Text From Word, Excel, Powerpoint?
There are several reasonable answers listed there, including using LibreOffice to do this (and for .doc, .docx, .pptx, etc, etc.), and the Apache Tika Project (which appears to be the 5,000lb gorilla in this solution space).
stringson the ppt?