I am trying to read only the lines of a web page that contain a URL. Every time I run the code, I get this error:

AttributeError: module 'urllib' has no attribute 'urlopen'

My code is below

import os
import subprocess
import urllib

datasource = urllib.urlopen("www.google.com")

while 1:
        line = datasource.readline()
        if line == "": break
        if (line.find("www") > -1) :
                print (line)


li = ['www.apple.com', 'www.google.com']
os.chdir('..')
os.chdir('..')
os.chdir('..')
os.chdir('Program Files (x86)\\LinkChecker')

for s in li:
    os.system('Start .\linkchecker ' + s)
  • Are you using Python 3.x or 2.7? Commented Jun 7, 2017 at 20:15
  • afaik urllib.urlopen is Python 2; in Python 3, try urllib.request.urlopen. Commented Jun 7, 2017 at 20:15

3 Answers

This is a very simple example. It works in Python 3.2 and later.

import urllib.request
with urllib.request.urlopen("http://www.apple.com") as url:
    r = url.read()
print(r)

For reference, see the related question "Urlopen attribute error".
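
To get closer to the question's goal of printing only the lines that mention a URL, the response can also be read line by line instead of all at once. This is a minimal sketch, assuming the page is UTF-8 text. `lines_containing` is a hypothetical helper name, and `io.BytesIO` with made-up content stands in for the response so the example runs without network access:

```python
import io


def lines_containing(stream, needle="www"):
    """Yield decoded lines from a binary stream that contain `needle`."""
    for raw in stream:  # binary file-like objects yield one bytes line at a time
        line = raw.decode("utf-8", errors="replace").rstrip("\r\n")
        if needle in line:
            yield line


# BytesIO stands in for the object returned by urllib.request.urlopen(...)
fake_response = io.BytesIO(
    b"<a href='http://www.apple.com'>Apple</a>\n<p>no link here</p>\n"
)
matches = list(lines_containing(fake_response))
print(matches)
```

With a real page, the object returned by urllib.request.urlopen("http://www.apple.com") could be passed in place of fake_response.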

You appear to be on Python 3.x, so you should use:

urllib.request.urlopen

2 Comments

Must be datasource = urllib.request.urlopen("http://www.google.com") (urllib.request.urlopen does not add "http://")
Removed that part. OP will understand that once it works :)

The AttributeError was because it should be urllib.request.urlopen instead of urllib.urlopen.

Apart from the AttributeError mentioned in the question, I ran into two more errors:

  1. ValueError: unknown url type: 'www.google.com'

    Solution: Include the scheme (https://) in the line that defines datasource:

    datasource = urllib.request.urlopen("https://www.google.com")

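  2. TypeError: a bytes-like object is required, not 'str' in the line `if (line.find("www") > -1) :`.

    Solution: urlopen returns bytes, so decode each line to str (or search for b"www") before calling find.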
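
Both errors can be reproduced and avoided without touching the network. This is a small sketch only: `with_scheme` and `contains_www` are hypothetical helper names, not part of urllib, and defaulting to https is an assumption.

```python
def with_scheme(url, default="https"):
    """Prefix a scheme when the URL has none, since urlopen requires one."""
    return url if "://" in url else f"{default}://{url}"


def contains_www(raw_line):
    """Decode a bytes line to str before searching it with a str pattern."""
    return raw_line.decode("utf-8", errors="replace").find("www") > -1


raw = b'<a href="https://www.google.com">'
try:
    raw.find("www")  # searching bytes with a str pattern raises TypeError
except TypeError as exc:
    print("TypeError:", exc)

print(with_scheme("www.google.com"))
print(contains_www(raw))
```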

The overall solution code is:

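import os
import urllib.request  # "import urllib" alone does not load the request submodule

datasource = urllib.request.urlopen("https://www.google.com")

while True:
    # readline() returns bytes; decode so the str comparisons below work
    line = datasource.readline().decode("utf-8", errors="replace")
    if line == "":  # an empty string means end of the response
        break
    if line.find("www") > -1:
        print(line)

li = ['www.apple.com', 'www.google.com']
os.chdir('..')
os.chdir('..')
os.chdir('..')
os.chdir('Program Files (x86)\\LinkChecker')

for s in li:
    # backslash escaped so that '\l' is not read as an escape sequence
    os.system('Start .\\linkchecker ' + s)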
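
If the aim is to collect the actual links on a page rather than every line that happens to contain "www", the standard library's html.parser module can pull out href attributes. This is a rough sketch and not part of the answer above. `LinkCollector` is a hypothetical class name, and the HTML string is made-up sample input:

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


sample_html = '<p><a href="https://www.apple.com">Apple</a> <a href="/about">About</a></p>'
collector = LinkCollector()
collector.feed(sample_html)
print(collector.links)
```

For a live page, the decoded response body (for example, the result of read().decode("utf-8", errors="replace")) would be passed to feed() instead of sample_html.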
