Making dynamic ajax requests in Python

Question

This might sound like a stupid question, but i am stuck at this step and badly need help. So I am trying to scrape this page [here][1] . This page makes a AJAX requests, whose header is as follows:

{
:authority: www.trip.com
:method: POST
:path: /restapi/soa2/16709/json/rateplan?testab=f14a7b98f0e497fd586c2946707d076a563a3d1d457219aa731e1cc518fe9df2
:scheme: https
accept: application/json
accept-encoding: gzip, deflate, br
accept-language: en-GB,en-US;q=0.9,en;q=0.8
cache-control: no-cache
content-length: 1661
content-type: application/json
}

In this header for the key "path" , the value of "testab" is generated dynamically. Now i want to get that value using only request method and not by opening the browser. So when i navigate to this url using request module, I want to be able to capture the value of "testab" dynamically. It would be a great help if anyone can provide me a solution. Thanks in advance. Also i cannot use scrapy for this purpose.

[1]: https://www.trip.com/hotels/mumbai-hotel-detail-762871/grand-hyatt-mumbai/?checkIn=2020-09-14&checkOut=2020-09-15&cityId=724&adult=2&children=0&ages=&crn=1&travelpurpose=0&curr=USD&showtotalamt=0&hoteluniquekey=H4sIAAAAAAAAAOPaycjFK8Fk8B8GGIWYOBilFjNyfJl7U12Iy9DE0sTczNzQwMhgCrNFs44jAwgcaHDwBDMKWh0CeCYxSnKCeef3OAiC6AbVnQ5OrBxr_SRYZjB-P663gpFxIyNEY5LDDkamE4x-C5j-PnnDvIuJleM1uwTTISA9SVCC5RQTwyUmhltMDI-YGF4xMXxiYvgFVdHEzNDFzDCJGaJuFjPDImYGIRaQG6UUjMxTjI0NE00tzYzMTSwT00B0qplJYpKxUXKiuaW5ArdG16GPv1iNGKyYpRjdPBiD2Iwd3SyMXKJkuJg9_YIE4xpqS16d2m4vxRwa7KKoqyj_JSdM2iGJNTVPNyIi4x1LAWMXI5MA4yRGTo7m3U8-Mp5gTAYA1R43aDgBAAA(

You can use BeautifulSoup to scrape the variables you are looking for, and then use them for your requests. — Mooncrater
– Mooncrater, Commented Sep 12, 2020 at 13:31

Jpsh · Accepted Answer · 2020-09-12 11:12:45Z

1

to accomplish this you can use curl

use pip and do pip install pycurl

then your code would look something like this (I copied code from documentation)

import pycurl
from StringIO import StringIO

buffer = StringIO()
c = pycurl.Curl()
c.setopt(c.URL, 'http://pycurl.io/')
c.setopt(c.WRITEDATA, buffer)
c.perform()
c.close()

body = buffer.getvalue()
# Body is a string in some encoding.
# In Python 2, we can print it without knowing what the encoding is.
print(body)

documentation here

However I recommend using Selenium because doing a request will merely get you the html and not load dynamic content, then you'd also have to find how to parse the data.

answered Sep 12, 2020 at 11:12

Jpsh

1,72612 silver badges17 bronze badges

Sign up to request clarification or add additional context in comments.

Collectives™ on Stack Overflow

Making dynamic ajax requests in Python

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related