I've been trying to parse a text file and manipulate it with regular expressions. This is my script:

import re
original_file = open('jokes.txt', 'r+')
original_file.read()
original_file = re.sub("\d+\. ", "", original_file)

How can I fix the following error?

Traceback (most recent call last):
  File "filedisplay.py", line 4, in <module>
    original_file = re.sub("\d+\. ", "", original_file)
  File "C:\Python32\lib\re.py", line 167, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or buffer

And why am I getting this error?

  • original_file is a file object; you need to read it to get its contents as a string, which is what the regex functions require. Commented May 9, 2014 at 16:06
  • Thank you, I've updated the code and it still throws an error :/ Commented May 9, 2014 at 16:09
  • Err, you didn't store the contents in the variable original_file, so you're still passing a file object to the regex. Why not use another variable, like contents = original_file.read()? Commented May 9, 2014 at 16:11
  • That solved the problem... (Newbie here!) Commented May 9, 2014 at 16:13

2 Answers

original_file is a file object, not a string. You need to read it to get its contents as a string, which is what re.sub requires.

It's also usually a good idea to use with, so that you don't have to remember to close the file. You might end up with something like this:

import re

with open('jokes.txt', 'r+') as original_file:
    contents = original_file.read()
    new_contents = re.sub(r"\d+\. ", "", contents)

Note that I made the regex a raw string (the r prefix before the opening quote). That's good practice: in a raw string Python leaves backslashes alone, so sequences like \d and \. reach the regex engine unchanged and you don't have to double-escape them.
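To make the difference concrete, here is a small sketch comparing a plain string with a raw string. The \b pattern and the sample text are my own illustration, not from the question; \b is a handy case because Python turns it into a backspace character in a plain string, while the regex engine reads it as a word boundary when it arrives intact.

```python
import re

# In a plain string, "\b" is a single backspace character.
# In a raw string, it stays as a backslash followed by "b",
# which the regex engine interprets as a word boundary.
plain = "\bcat\b"
raw = r"\bcat\b"

print(len(plain))  # 5
print(len(raw))    # 7

text = "a cat sat"
print(re.findall(plain, text))  # []
print(re.findall(raw, text))    # ['cat']

# The pattern from the question, written as a raw string.
numbered = "1. First joke\n2. Second joke\n"
cleaned = re.sub(r"\d+\. ", "", numbered)
print(cleaned)
```

The question's pattern happens to work without the r prefix because \d and \. are not escape sequences Python recognizes, but relying on that is fragile, so the raw string is the safer habit.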

1 Comment

This is explanatory... Thanks :)

You call original_file.read(), but you don't assign that value to anything.

>>> original_file = open('test.txt', 'r+')
>>> original_file.read()
'Hello StackOverflow,\n\nThis is a test!\n\nRegards,\naj8uppal\n'
>>> print original_file
<open file 'test.txt', mode 'r+' at 0x1004bd250>
>>> 

Therefore, you need to assign original_file = original_file.read():

import re
original_file = open('jokes.txt', 'r+')
original_file = original_file.read()  # original_file is now a str, not a file object
original_file = re.sub(r"\d+\. ", "", original_file)

I would also suggest using with, as @Jerry does, so that the file is closed automatically once you are done with it.
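If the goal is to save the cleaned text back to the same file (which the 'r+' mode suggests, though the question doesn't say so), a minimal sketch could look like the following. The function name strip_numbering and the temporary demo file are my own assumptions for illustration; the demo writes to a throwaway file so jokes.txt is not touched.

```python
import os
import re
import tempfile


def strip_numbering(path):
    """Remove markers such as "1. " or "2. " from the file at path, in place."""
    with open(path, "r+") as f:
        contents = f.read()
        new_contents = re.sub(r"\d+\. ", "", contents)
        f.seek(0)      # return to the start of the file before writing
        f.write(new_contents)
        f.truncate()   # the new text is shorter, so cut off the leftover tail
    return new_contents


# Demo on a temporary file (hypothetical sample content).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1. First joke\n2. Second joke\n")

result = strip_numbering(tmp.name)
with open(tmp.name) as f:
    on_disk = f.read()
os.remove(tmp.name)

print(on_disk)
```

The seek and truncate calls matter here: after read() the file position is at the end, so writing without seeking back would append the text, and skipping truncate would leave the end of the old contents behind.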
