I am trying to scrape data from this website. The drop-down menus populate based on the data entered, so I am making multiple POST requests like this:

import requests
from bs4 import BeautifulSoup

url = 'http://59.180.234.21:85/index.aspx'

with requests.Session() as session:
    # Load the form to read the initial hidden state fields
    response = session.get(url)
    soup = BeautifulSoup(response.content, "html5lib")

    # Step 1: select the district
    data = {
        'ddlDistrict': '165',
        '__VIEWSTATE': soup.find('input', {'name': '__VIEWSTATE'}).get('value', ''),
        '__EVENTVALIDATION': soup.find('input', {'name': '__EVENTVALIDATION'}).get('value', ''),
    }
    response = session.post(url, data=data)
    soup = BeautifulSoup(response.content, "html5lib")

    # Step 2: select the police station
    data = {
        'ddlDistrict': '165',
        'ddlPS': '11',
        '__VIEWSTATE': soup.find('input', {'name': '__VIEWSTATE'}).get('value', ''),
        '__EVENTVALIDATION': soup.find('input', {'name': '__EVENTVALIDATION'}).get('value', ''),
    }
    response = session.post(url, data=data)
    soup = BeautifulSoup(response.content, "html5lib")

    # Step 3: enter the registration number and year
    data = {
        'ddlDistrict': '165',
        'ddlPS': '11',
        'txtRegNo': '100',
        'ddlYear': '2011',
        '__VIEWSTATE': soup.find('input', {'name': '__VIEWSTATE'}).get('value', ''),
        '__EVENTVALIDATION': soup.find('input', {'name': '__EVENTVALIDATION'}).get('value', ''),
    }
    response = session.post(url, data=data)

After doing this, the last page has an HTML table with a button that I can click to view the report. I want to simulate clicking the button and get the response, which I can then parse with BeautifulSoup. Please let me know how to do this. Sample input: District: "New Delhi Distt", Police Station: "Con.Place", FirNo: "100", Year: "2011" will give you one FIR to view. The button has the following code:

onclick="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("DgRegist$ctl03$imgDelete", "", true, "", "", false, false))"
  • "I want to be able to simulate clicking the button and getting the response(...)" - It looks like a task for Selenium, unless, of course, you can get the URL beforehand. Commented Aug 8, 2017 at 21:06
  • Possible duplicate of Python click button with requests Commented Aug 8, 2017 at 21:09

1 Answer

If you can reproduce the HTTP request the button makes, then you'll have the data you want. If the button does not make any request, then the data is already somewhere in the page and you just need to find it and parse it out.

EDIT:

In your case it's submitting the form data to a redirect to the same page. For this, you would include the form data in the request to the page, and the response would contain the resulting data. For example:

import requests

headers = {
    'Origin': 'http://59.180.234.21:85',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'en-US,en;q=0.8',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Cache-Control': 'max-age=0',
    'Referer': 'http://59.180.234.21:85/index.aspx',
    'Connection': 'keep-alive',
}

data = [
  ('__EVENTTARGET', ''),
  ('__EVENTARGUMENT', ''),
  ('__LASTFOCUS', ''),
  ('__VIEWSTATE', '/wEPDwUJMTQ2MDgwNjA1D2QWAgIDD2QWAgIFD2QWBAIBD2QWCGYPZBYEAgEPZBYCAgEPEGRkFgFmZAIDD2QWAgIBDxBkZBYAZAIBD2QWBAIBD2QWAgIBDxAPFgYeDURhdGFUZXh0RmllbGQFCENpdHlOYW1lHg5EYXRhVmFsdWVGaWVsZAUIQ2l0eUNvZGUeC18hRGF0YUJvdW5kZ2QQFREMLS0tU0VMRUNULS0tDUNFTlRSQUwgRElTVFQSQ1JJTUUgQU5EIFJBSUxXQVlTEEVBU1QgREVMSEkgRElTVFQJSUdJIERJU1RUD05FVyBERUxISSBESVNUVAtOT1JUSCBESVNUVBBOT1JUSCBFQVNUIERJU1RUEE5PUlRIIFdFU1QgRElTVFQLT1VURVIgRElTVFQLU09VVEggRElTVFQQU09VVEggRUFTVCBESVNUVBBTT1VUSCBXRVNUIERJU1RUElNQRUNJQUwgQ0VMTCBESVNUVA5TUFVXICYgQyBESVNUVAlWSUdJTEFOQ0UKV0VTVCBESVNUVBURDC0tLVNFTEVDVC0tLQMxNjIDMTY0AzE2OAMxNjkDMTY1AzE2NgMxNzMDMTcyAzE3NAMxNjcDOTU1AzE3MQM5NTQDOTUzAzE2MQMxNzAUKwMRZ2dnZ2dnZ2dnZ2dnZ2dnZ2cWAQIFZAIDD2QWAgIBDxAPFgYfAAUHUFNfTmFtZR8BBQdQU19Db2RlHwJnZBAVCQwtLS1TRUxFQ1QtLS0PQkFSQUtIQU1CQSBST0FEDUNIQU5BS1lBIFBVUkkKQ09OLiBQTEFDRQpFWEguIEdST1VOC01BTkRJUiBNQVJHClBULiBTVFJFRVQKVElMQUsgTUFSRwtUVUdMQUsgUk9BRBUJDC0tLVNFTEVDVC0tLQIwMgIwNwIxMQIxMgIxNQIyMgIzNQIzNhQrAwlnZ2dnZ2dnZ2cWAQIDZAICD2QWBAIBD2QWAgIBDw8WAh4JTWF4TGVuZ3RoAgRkZAIDD2QWAgIBDxBkDxYHAgECAgIDAgQCBQIGAgcWBxAFBDIwMTcFBDIwMTdnEAUEMjAxNgUEMjAxNmcQBQQyMDE1BQQyMDE1ZxAFBDIwMTQFBDIwMTRnEAUEMjAxMwUEMjAxM2cQBQQyMDEyBQQyMDEyZxAFBDIwMTEFBDIwMTFnZGQCAw9kFgRmD2QWAgIBDxBkZBYBAgNkAgIPZBYCAgEPDxYCHgdFbmFibGVkaGRkAgMPZBYCAgEPZBYCZg9kFgICAQ9kFgQCAQ88KwALAGQCAw88KwARAgEQFgAWABYADBQrAABkGAEFCmdyZG12dGhlZnQPZ2SPDrK3c7Ukzq5Wg/XtZSQMgDzEoWpRz8kXOVH1TO1LcA=='),
  ('__EVENTVALIDATION', '/wEdAC8iT6D3HjIr+ivdq0yBTgClCsHRaAEHIr772zKgggdQ+5cM7ByNsRG4qWi12q7B1tveFDGmjlPiBn9IJO8m9jt8W1Wcqc3FqlgV9EENz1OdJenvj2TG96ujSrFeprbtr3RTWKEdLSZa5NFLztoz81urAMmLvBzV7Qyb4qeGafdxuGr4cVZnct4CZh3KKsvt+xdAs0fg094ls2+uRMaFDPjjvXQmtkg7agsuhug+xMVSXXqKkbM01pitokD3Lzhr/+Zrc1JkJBoj+hAGr8ppVSNG4Yj6XkYB+ZGeix5+udiv9J9IjbG0sujSnR9YEqeLFuIKGVNDezkrxdfUawGK33AxvjAuIFmExdxunofmSVMj2KhPcg/6G9KkHuC16bwbWAqSNP2Vcw4/0wky0Un3Ssd3cGZtjtv+8Amihean2n5uODEqvswSsIcl9+U0P3atZA9gLfz10VlY0S1jS6520f4SrEv7IkN+08PXTozm9OT6/xtTbG8qE+XuugkwabaWLRSnp8pclR+ltj186j/FXuFQADgLnY9pn1HgIJ6W1oaeYRGUECgQhKzewPcXKgm68keQY5UuqQXqAyLatchak9gZ0UXh+krR/3fyyNtTnsY2m8PCEGuPl86vYAMVmqqL9lXoXDEtci8mednFEKQQYva+qH6WXxs8JPfC5HROATEan29Lv0JBrmCBZS2sro8ULkaKOxbg8uzVwdeGr6v29r+3doU6WdnwFP0DXPL1dqxkGAcZoyyvxsCvu30nzr6m7V8lgJSWBob7Dm8GjVgW5r9J4pnX0P+2bLZvBfOH/t4fWMmWiUd3VkQPcKR+pddTuBtpJk290kZ4wQ4JdvCFsSKdBaNizvIH0xP0v3ruMbsMtxjvy3Vie7D95PeNV8/hUPt4D+GqPsOH44Eo2T+LfQkxwBWNveA+4s3aFDJlbkXzUPNrXlzDLLAaZVBaziFS2sS3u5FK3YA3jSyXSEoDlVEvjtTdVzRZn7DFyWrI8V/OY49Qu8R8qTviVpgIZnzlz1HnUusdQsXU9clbfRlGQn3F'),
  ('ddlDistrict', '165'),
  ('ddlPS', '11'),
  ('txtRegNo', '100'),
  ('ddlYear', '2011'),
  ('txRegFromDt', ''),
  ('txRegToDt', ''),
  ('txtCompNM', ''),
  ('btnSearch', 'Search'),
]

response = requests.post('http://59.180.234.21:85/index.aspx', headers=headers, data=data)

print(response.content)
>>> b'\r\n\r\n\r\n<!DOCTYPE html PUBLIC ...... FIR No.</a></td><td style="width:10%;"><a href="javascript:__doPostBack(&#39;DgRegist$ctl02$ctl03&#39;,&#39;&#39;)" style="color:Black;">Fir Year</a></td><td style="width:10%;">FIR Date</td><td style="width:15%;">\r\n                                            View FIR\r\n                                        </td>\r\n\t\t\t</tr><tr class="DataItemStyle ">\r\n\t\t\t\t<td>0100</td><td>2011</td><td>29-05-2011</td><td>\r\n                                            <input type="image" name="DgRegist$ctl03$imgDelete" id="DgRegist_ctl03_imgDelete" src="Images/print.gif" ... ... \r\n</form>\r\n</body>\r\n</html>\r\n'
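The response above contains the button itself as an `<input type="image" name="DgRegist$ctl03$imgDelete">`. When a browser submits a form through an image button, it sends the click coordinates as `name.x` and `name.y` along with the other form fields, and it leaves out any other submit button such as `btnSearch`. A minimal sketch of building that payload follows. The helper name `click_image_button` and its `exclude` parameter are hypothetical, and whether this site returns the report directly in the response to such a request is an assumption to check in the browser's network tab.

```python
def click_image_button(form_fields, button_name, x=1, y=1, exclude=()):
    """Build form data as if an <input type="image"> button had been clicked.

    form_fields: dict of the fields from the page holding the button,
                 including a fresh __VIEWSTATE and __EVENTVALIDATION.
    button_name: the button's name attribute.
    exclude:     names of other submit buttons that must not be sent.
    """
    # Copy the fields so the caller's dict is left unchanged
    data = {k: v for k, v in form_fields.items() if k not in exclude}
    # Image buttons submit the click position as name.x and name.y
    data[button_name + ".x"] = str(x)
    data[button_name + ".y"] = str(y)
    return data
```

The result could then be passed as `data=` to a `requests` POST against the same URL. The `__VIEWSTATE` and `__EVENTVALIDATION` values must come from the search results page, not from the initial page load.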

10 Comments

The button has the following code: onclick="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions(&quot;DgRegist$ctl03$imgDelete&quot;, &quot;&quot;, true, &quot;&quot;, &quot;&quot;, false, false))". I'm not sure whether that means I can generate the HTTP request or not.
If you open the browser's inspector and keep the network tab open when you click the button, you'll see the request if it makes one. If it does not make a request to anything, then the data is likely already there somewhere, as it is for sites such as Facebook. Looking through the page source (not the code shown in the inspector) would tell you if the data is somewhere in the JavaScript.
In your case it's submitting the form data to a redirect to the same page. For this, you would include the form data in the request to the page, and the response would contain the resulting data.
Thanks for the detailed response. The above code produces a network error. It's missing a comma in line 21 as well. Don't we have to make the POST requests step by step, since selecting one drop-down item populates the next, rather than making one POST request?
The above code gives a network error because the form is incomplete. At each step of the dropdown it adds more to the form, so yes you can make each request incrementally, but if you have all the data being submitted in the final request you should be able to just submit all the data at once to retrieve the page you want.
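One way to keep the form complete at every step is to read all hidden inputs from the latest response instead of hardcoding them, then add the visible fields on top. The sketch below uses only the standard library's `html.parser` so that it runs without extra packages. The names `HiddenInputCollector`, `hidden_fields` and `build_payload` are hypothetical, and the same collection could be done with BeautifulSoup's `find_all`.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> element."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and (attrs.get("type") or "").lower() == "hidden":
            name = attrs.get("name")
            if name:
                # A missing value attribute is submitted as an empty string
                self.fields[name] = attrs.get("value") or ""


def hidden_fields(html):
    """Return a dict of the hidden form fields found in an HTML string."""
    collector = HiddenInputCollector()
    collector.feed(html)
    return collector.fields


def build_payload(html, visible_fields):
    """Merge the page's hidden state with the fields chosen by the user."""
    payload = hidden_fields(html)
    payload.update(visible_fields)
    return payload
```

For each step, something like `session.post(url, data=build_payload(response.text, {'ddlDistrict': '165'}))` would then carry forward whatever state the previous response returned, including `__VIEWSTATE` and `__EVENTVALIDATION`.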