0

I am trying to send data (string) into the amazon.com search box on the following xpath:

//input[@id="twotabsearchtextbox"] 

with requests with python. I want to be able to push any data in the search box, just like if I was a regular user. IE: typing in the search box: "apple watch".

Here is my code:

import requests
from lxml import html
url = "https://www.amazon.com/"
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) ''AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36'}

page = requests.get(url,headers=headers)
tree = html.fromstring(page.content)
search_box = tree.xpath('//input[@id="twotabsearchtextbox"]')
print(search_box)

I get a good response code: 200 when tested as well as the element from requests:

[<InputElement 1b7eaea45e8 name='field-keywords' type='text'>]

My question is how can I push data using the requests and not Selenium or Scrapy? Thanks

1 Answer 1

3

Getting search suggestions

Edit: OP seems to want search suggestions, here's how to do that.

You need marketplaceId mid, alias and prefix to make an AJAX request to suggestions endpoint. You can extract marketplace id from HTML using re.

You can find the original request by opening your browser's Developer Tools (F12) and switching to Network tab, then type in some text into the search box while you monitor the requests. You'll see a request made to completion.amazon.com.

from bs4 import BeautifulSoup
import requests
from urllib.parse import quote
import re
from pprint import pprint

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0'}
def get_html(url: str) -> str:
    res = requests.get(url, headers=headers)
    res.raise_for_status()
    html = res.text
    return html

def get_marketplace_id(html: str) -> str:
    return re.search('obfuscatedMarketId:\s*"([^\"]+)"', html).group(1)

def get_suggestions(mid: str, keyword: str) -> list:
    url = f'https://completion.amazon.com/api/2017/suggestions?lop=en_US&mid={mid}&alias=aps&prefix={quote(keyword)}'
    res = requests.get(url, headers)
    res.raise_for_status()
    data = res.json()
    suggestions_raw = data['suggestions']
    suggestions = []
    for it in suggestions_raw:
        suggestions.append(it['value'])
    return suggestions

html = get_html('https://www.amazon.com')
mid = get_marketplace_id(html)
pprint(get_suggestions(mid, 'apple watch'))

output:

['apple watch band 38mm',
 'apple watch',
 'apple watch band 42mm',
 'apple watch charger',
 'apple watch band',
 'apple watch series 3',
 'apple watch series 4',
 'apple watch band 44mm series 4',
 'apple watch screen protector',
 'apple watch band 40mm series 4']


Getting search results

A simpler method would be creating a search url instead:

search_url = 'https://www.amazon.com/s'
page = requests.get(search_url, headers=headers, params={'k': 'apple watch'})

This will give you the search results directly, saving you a request.

Sign up to request clarification or add additional context in comments.

4 Comments

Hi @abdusco, that approach will not work, as I am trying to collect the suggestion from the autocomplete search box and not the actual details of the product, also that type of search will require to send a new request url for each keyword rather then getting the data from the same url. Thanks anyhow
You should amend your question to clarify that point @AmatoIlCiabattaro
You can't do that without running JS, which requires a proper browser, you need selenium for that unless you capture and mimic the AJAX request yourself.
@AmatoIlCiabattaro I've updated my answer for search suggestions

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.