I am trying to produce an XML file for every PDF in a directory, and for that I want to use the pdfminer code from https://github.com/euske/pdfminer/blob/master/tools/pdf2txt.py on Python 3. I installed pdfminer.six and all the related packages. However, the script opens files with file('', 'rb'), which has to be replaced with open('', 'rb') on Python 3. The object I get from open() is not the same as the one I get from file(), so the function from the GitHub link does not run on my PDF files.
For instance,
in Python 2, the following code returns:
fp = file(filepath, 'rb')
<open file '...\\lehsfil2.pdf', mode 'rb' at 0x04ADB180>
whereas the equivalent of file() in Python 3 returns:
fp = open(filepath, 'rb')
<_io.BufferedReader name='...\\lehsfil2.pdf'>
Is there a direct equivalent of Python 2's file() in Python 3 that returns the same object?
open returns a file object the same way file does (see stackoverflow.com/questions/32131230/python-file-function for details). In Python 3, file has been removed, but open continues to work as in Python 2, with nearly the same API. What error do you get when you replace file with open in the code? I took a look at the library, and all the API calls it uses should be available on both Python 2 and Python 3.
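To illustrate the point about the API, here is a minimal sketch that opens a file with open(path, 'rb') on Python 3 and runs the kind of calls a PDF parser usually relies on (read, seek, tell). The helper name describe_binary_handle and the tiny sample.pdf written to a temporary directory are made up for the illustration; they are not part of pdfminer, and the sample is only a stand-in for a real PDF.

```python
import io
import os
import tempfile


def describe_binary_handle(path):
    """Open `path` in binary mode and exercise read/seek/tell on the handle.

    Hypothetical helper for illustration only; not part of pdfminer.
    """
    with open(path, "rb") as fp:
        header = fp.read(5)          # bytes, like a Python 2 str read in 'rb' mode
        fp.seek(0, os.SEEK_END)      # jump to the end to measure the file
        size = fp.tell()
        fp.seek(0)                   # rewind, as a parser would before reading
        return {
            "is_buffered_reader": isinstance(fp, io.BufferedReader),
            "header": header,
            "size": size,
            "mode": fp.mode,
        }


# Assumption: a tiny fake file stands in for a real PDF such as lehsfil2.pdf.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.pdf")
    with open(sample_path, "wb") as out:
        out.write(b"%PDF-1.4\n%%EOF\n")
    info = describe_binary_handle(sample_path)
    print(info)
```

The repr differs from Python 2 (<_io.BufferedReader ...> instead of <open file ...>), but the handle still supports the same read, seek and tell calls. So if the script still fails after swapping file for open, the cause is probably somewhere else, and the traceback would help to find it.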