I'm creating a script that reads information from a search query on zhaopin.com using urllib2.

When I try to open the url by copying it to my web browser (Chrome), I have no problem opening the site: http://sou.zhaopin.com/jobs/searchresult.ashx?p=1&isadv=0&bj=160000&in=160200

When I open the URL using urllib2, I get the error message HTTPError: HTTP Error 502: Bad Gateway. Searching Google did not help me figure out what I'm doing wrong.

import urllib
import urllib2
data = {}
data['in']='160200'
data['bj']='160000'
data['isadv']='0'
data['p']=1

url = 'http://sou.zhaopin.com/jobs/searchresult.ashx?'
url_values = urllib.urlencode(data)
full_url= url + url_values
print full_url
response = urllib2.urlopen(url)
html = response.read()
response.close()

Perhaps it is a problem with the URL: after opening the URL in Chrome, the 'http://' disappears. I'd appreciate any help figuring this out.

2
  • Are you behind the Great Firewall of China? Try capturing the HTTP session using Wireshark and look at the raw data. The difference in the requests should be visible there. Commented Aug 6, 2017 at 7:34
  • The 'http://' disappearing from the address bar is just a Chrome display feature, nothing else. Commented Aug 6, 2017 at 7:42
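
One way to act on the first comment without a packet capture is to make the script's request look more like the browser's and inspect what would be sent. The sketch below is only an illustration: it assumes the server reacts to request headers, which nothing in this thread confirms, and the header values and the helper name build_browser_like_request are made up for the example. The import fallback lets it run on Python 2 (urllib2) or Python 3 (urllib.request).

```python
try:
    from urllib2 import Request  # Python 2
except ImportError:
    from urllib.request import Request  # Python 3

# Hypothetical browser-like headers; the exact values are assumptions.
BROWSER_HEADERS = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64)',
    'Accept': 'text/html,application/xhtml+xml',
}


def build_browser_like_request(url, headers=BROWSER_HEADERS):
    """Return a Request for url carrying the given headers (nothing is sent yet)."""
    return Request(url, headers=dict(headers))


request = build_browser_like_request(
    'http://sou.zhaopin.com/jobs/searchresult.ashx?p=1&isadv=0&bj=160000&in=160200')
# header_items() lists the headers attached so far, for comparison with the browser.
print(request.header_items())
```

The resulting object can be passed to urlopen in place of the plain URL string. If the response is the same with and without these headers, the headers are probably not the cause.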

2 Answers

1

Try urllib instead of urllib2:

response = urllib.urlopen(url)
html = response.read()
response.close()

0
HTTP Error 502: Bad Gateway

The above error means that a server acting as a gateway or proxy received an invalid response from the upstream server it relies on. This can be caused by a misconfiguration, or by the upstream server rebooting or being unavailable at that moment.

This error can also be a result of poor IP communication between back-end computers, possibly including the server at the site you are trying to visit. It may be that the server is overloaded.
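
Because a 502 caused by a rebooting or overloaded back-end can be transient, retrying after a short pause is one option. The following is a minimal sketch under that assumption; fetch_with_retry and its parameters (opener, attempts, delay) are hypothetical names chosen for the example, not part of urllib2. It will not help if the server rejects the request itself.

```python
import time

try:
    from urllib2 import HTTPError, urlopen  # Python 2
except ImportError:
    from urllib.request import urlopen  # Python 3
    from urllib.error import HTTPError


def fetch_with_retry(url, opener=urlopen, attempts=3, delay=1.0):
    """Return the response body, retrying only when the server answers 502."""
    for attempt in range(attempts):
        try:
            response = opener(url)
            try:
                return response.read()
            finally:
                response.close()
        except HTTPError as err:
            # Re-raise any other status, and the 502 from the final attempt.
            if err.code != 502 or attempt == attempts - 1:
                raise
            # Wait longer after each failure: delay, 2*delay, 4*delay, ...
            time.sleep(delay * (2 ** attempt))
```

A call such as fetch_with_retry(full_url) would then make up to three attempts before giving up. The opener parameter exists so that a stand-in function can replace urlopen when testing without network access.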

You can use urllib on its own in your code to open the URL:

import urllib
data = {}
data['in']='160200'
data['bj']='160000'
data['isadv']='0'
data['p']=1

url = 'http://sou.zhaopin.com/jobs/searchresult.ashx?'
url_values = urllib.urlencode(data)
full_url= url + url_values
print full_url
# Open the full URL so the query parameters are actually sent
response = urllib.urlopen(full_url)
html = response.read()
response.close()
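
Since the snippet builds and prints full_url, it is worth confirming that the string handed to urlopen really carries the query parameters. The sketch below shows one way to check this by parsing the query string back out. The helper name build_search_url is hypothetical, and whether a missing query string is what triggers the 502 here is an assumption, not something the thread establishes.

```python
try:
    from urllib import urlencode  # Python 2
    from urlparse import urlparse, parse_qs
except ImportError:
    from urllib.parse import urlencode, urlparse, parse_qs  # Python 3


def build_search_url(base, params):
    """Append URL-encoded params to base, which must not already contain '?'."""
    return base + '?' + urlencode(params)


# A list of pairs keeps the parameter order stable across Python versions.
params = [('p', 1), ('isadv', '0'), ('bj', '160000'), ('in', '160200')]
full_url = build_search_url('http://sou.zhaopin.com/jobs/searchresult.ashx', params)

# Parse the query string back out to confirm every parameter is present.
query = parse_qs(urlparse(full_url).query)
print(full_url)
print(sorted(query))
```

If the parsed query is empty, the request is going to the bare search page rather than to the intended search.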
