So I tried to write a simple function to clean a text and summarize it:

def getTextWaPo(url):
    page = urllib2.urlopen(url).read().decode('utf8')
    soup = BeautifulSoup(page, "lxml")
    text = ' '.join(map(lambda p: p.text, soup.find_all('article')))
    return text.encode('ascii', errors='replace').replace("?"," ")

But when I run this code, I get the following error:

  File "Autosummarizer.py", line 12, in getTextWaPo
  return text.encode('ascii', errors='replace').replace("?"," ")
  TypeError: a bytes-like object is required, not 'str'

  line 12 ==> text = getTextWaPo(articleURL)

What should I do?

2 Answers


In line 12 you encode the text, so the result is a bytes object, and replace() on bytes needs bytes arguments: replace(b"?", b" ").

The code should look like this:

from urllib.request import urlopen
from bs4 import BeautifulSoup

def getTextWaPo(url):
    # Download the page and decode the raw bytes into a str
    page = urlopen(url).read().decode('utf8')
    soup = BeautifulSoup(page, "lxml")
    # Join the text of every <article> element
    text = ' '.join(map(lambda p: p.text, soup.find_all('article')))
    # encode() returns bytes, so replace() must be given bytes too
    return text.encode('ascii', errors='replace').replace(b"?", b" ")

getTextWaPo("https://stackoverflow.com/")

You must change your last line return text.encode('ascii', errors='replace').replace("?"," ") to return text.encode('ascii', errors='replace').replace(b"?", b" ") because after the encode() you're operating on bytes, and must replace bytes with other bytes.
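
To see the bytes-versus-str distinction without fetching a page, here is a minimal sketch. The helper name to_ascii_text and the sample string are made up for illustration and are not part of the original code. It also decodes the result back to str, which the function in the question does not do, on the assumption that later processing expects text rather than bytes. Note that any genuine "?" in the input is replaced with a space as well.

```python
def to_ascii_text(text: str) -> str:
    """Replace every non-ASCII character (and every '?') with a space."""
    # encode() returns bytes; errors='replace' substitutes b"?" for each
    # character that ASCII cannot represent.
    encoded = text.encode('ascii', errors='replace')
    # bytes.replace() only accepts bytes arguments, hence the b"..." literals.
    cleaned = encoded.replace(b"?", b" ")
    # Decode so the caller gets a str back instead of bytes.
    return cleaned.decode('ascii')


# Hypothetical sample input containing two non-ASCII characters.
sample = "caf\u00e9 na\u00efve"
print(to_ascii_text(sample))

# Passing str arguments to bytes.replace() reproduces the original error.
try:
    sample.encode('ascii', errors='replace').replace("?", " ")
except TypeError as exc:
    print("TypeError:", exc)
```

If the function should keep returning bytes, as in the question, drop the final decode() call.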
