This might sound like a stupid question, but I am stuck at this step and badly need help. I am trying to scrape this page [here][1]. The page makes an AJAX request whose headers are as follows:

{
:authority: www.trip.com
:method: POST
:path: /restapi/soa2/16709/json/rateplan?testab=f14a7b98f0e497fd586c2946707d076a563a3d1d457219aa731e1cc518fe9df2
:scheme: https
accept: application/json
accept-encoding: gzip, deflate, br
accept-language: en-GB,en-US;q=0.9,en;q=0.8
cache-control: no-cache
content-length: 1661
content-type: application/json
}

In this header, the value of the key "path" contains a "testab" parameter that is generated dynamically. I want to get that value using only the requests module, without opening a browser. So when I fetch this URL with requests, I want to be able to capture the value of "testab". It would be a great help if anyone could provide a solution. Thanks in advance. Also, I cannot use Scrapy for this purpose.

[1]: https://www.trip.com/hotels/mumbai-hotel-detail-762871/grand-hyatt-mumbai/?checkIn=2020-09-14&checkOut=2020-09-15&cityId=724&adult=2&children=0&ages=&crn=1&travelpurpose=0&curr=USD&showtotalamt=0&hoteluniquekey=H4sIAAAAAAAAAOPaycjFK8Fk8B8GGIWYOBilFjNyfJl7U12Iy9DE0sTczNzQwMhgCrNFs44jAwgcaHDwBDMKWh0CeCYxSnKCeef3OAiC6AbVnQ5OrBxr_SRYZjB-P663gpFxIyNEY5LDDkamE4x-C5j-PnnDvIuJleM1uwTTISA9SVCC5RQTwyUmhltMDI-YGF4xMXxiYvgFVdHEzNDFzDCJGaJuFjPDImYGIRaQG6UUjMxTjI0NE00tzYzMTSwT00B0qplJYpKxUXKiuaW5ArdG16GPv1iNGKyYpRjdPBiD2Iwd3SyMXKJkuJg9_YIE4xpqS16d2m4vxRwa7KKoqyj_JSdM2iGJNTVPNyIi4x1LAWMXI5MA4yRGTo7m3U8-Mp5gTAYA1R43aDgBAAA(

  • You can use BeautifulSoup to scrape the variables you are looking for, and then use them for your requests. Commented Sep 12, 2020 at 13:31
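A minimal sketch of the idea in this comment, under a loud assumption: that the page embeds the token in an inline script as something like `"testab": "<hex string>"`. That has not been verified against trip.com, and the token may instead be computed by JavaScript at runtime, in which case nothing like this will find it. To stay dependency-free, the sketch uses the standard library's `html.parser` in place of BeautifulSoup; `ScriptCollector`, `TOKEN_RE`, and `find_testab` are illustrative names.

```python
import re
from html.parser import HTMLParser
from typing import Optional


class ScriptCollector(HTMLParser):
    """Collects the text content of every <script> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Only keep text that appears inside a <script> element.
        if self._in_script:
            self.scripts[-1] += data


# Assumption: the token is a lowercase hex string stored under a key
# named "testab", e.g. "testab": "f14a..." or testab = 'f14a...'.
TOKEN_RE = re.compile(r'testab["\']?\s*[:=]\s*["\']([0-9a-f]+)["\']')


def find_testab(html: str) -> Optional[str]:
    """Return the first testab-like token found in a script, or None."""
    collector = ScriptCollector()
    collector.feed(html)
    collector.close()
    for script in collector.scripts:
        match = TOKEN_RE.search(script)
        if match:
            return match.group(1)
    return None
```

The HTML passed to `find_testab` would come from an ordinary GET of the hotel page, for example `requests.get(url).text`. If the function returns `None`, the token is probably not present in the static HTML.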

1 Answer

To accomplish this you can use curl through its Python binding, PycURL.

Install it with pip: pip install pycurl

Then your code would look something like this (I copied the code from the documentation):

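import pycurl
from io import BytesIO  # the StringIO module only exists in Python 2

# PycURL writes the response body as bytes, so collect it in a BytesIO.
buffer = BytesIO()
c = pycurl.Curl()
c.setopt(c.URL, 'http://pycurl.io/')
c.setopt(c.WRITEDATA, buffer)
c.perform()
c.close()

body = buffer.getvalue()
# Body is a byte string in Python 3.
# Decode it with the page's encoding before printing
# (iso-8859-1 here; adjust for other pages).
print(body.decode('iso-8859-1'))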

See the PycURL documentation for details.

However, I recommend using Selenium, because a plain request will only get you the HTML and will not load dynamic content. You would then also have to work out how to parse the data.
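If the "testab" value can be recovered without a browser, the AJAX call from the question could in principle be reproduced directly. The following is only a sketch of how such a request might be assembled with the standard library. The endpoint path and the two headers are taken from the question; the JSON payload is not shown there (only its length, 1661 bytes), so `payload` is a placeholder the caller must supply, and `build_rateplan_request` is a hypothetical helper name. The server may also require cookies or further headers not listed in the question.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint taken from the ":authority" and ":path" headers in the question.
RATEPLAN_URL = "https://www.trip.com/restapi/soa2/16709/json/rateplan"


def build_rateplan_request(testab: str, payload: dict) -> Request:
    """Build (but do not send) the POST request for the rate plan endpoint.

    `payload` is a placeholder: the real request body is not shown in
    the question and has to be captured from the browser's network tab.
    """
    url = f"{RATEPLAN_URL}?{urlencode({'testab': testab})}"
    body = json.dumps(payload).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
        },
    )
```

The returned object can be sent with `urllib.request.urlopen(req)`. The same URL, headers, and JSON body could equally be passed to `requests.post`.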
