
I am scraping a page for some data, but first I need to enter text into a text box, submit the form, and then scrape the result page. I looked at the page source, but I'm not sure how to trigger the button or pass it the argument.

The website is http://archive.org/web/web.php. I'm trying to look at some historical snapshots and have no idea what to use for this. I'm open to any solution.

  • No idea, I updated the question. I am using Python for scraping. Commented Mar 15, 2013 at 1:43
  • I am reading documentation on the web and on Python, but have found nothing about buttons so far, hence I asked here. Commented Mar 15, 2013 at 2:00
  • There are many ways to send HTTP requests from Python: use requests and write the parameters yourself, or automate the submission with something like mechanize. Commented Mar 15, 2013 at 2:01
  • Submitting a text box with a button would be a POST in requests, then? Commented Mar 15, 2013 at 2:10

1 Answer


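First, you should know that clicking a submit button normally sends the form's data to the URL in the form's action attribute, using the HTTP method named in its method attribute. That is often POST, but this form uses GET. Here is the form: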

<form id="wwmform" name="wwmform" method="get" action="http://web.archive.org/form-submit.jsp" onsubmit="document.location.href='http://web.archive.org/web/*/'+document.getElementById('wwmurl').value;return false;" style="display:inline;">
      <input id="wwmurl" type="text" name="url" size="50" value="http://">
      <button type="submit" name="type" value="urlquery" class="roundbox5">Take Me Back</button>
    </form>

See the action attribute? That's where the data would normally be sent.

So in Python, you can use urllib and urllib2 to encode the form data, send it to the target URL, and then fetch the result.
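As a rough sketch of that idea, the snippet below encodes the two fields the form would submit (the text input named url and the button named type with value urlquery) and appends them as a query string, since the form's method is get. It uses urllib.parse and urllib.request, the Python 3 counterparts of urllib and urllib2. The function names build_form_url and fetch are made up for illustration, and whether form-submit.jsp still answers this way is an assumption.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Taken from the form's action attribute.
FORM_ACTION = "http://web.archive.org/form-submit.jsp"


def build_form_url(page_url):
    """Build the GET URL a browser would request if onsubmit did not intercept.

    Field names come from the form: the text input is named "url" and the
    submit button contributes type=urlquery.
    """
    query = urlencode({"url": page_url, "type": "urlquery"})
    return FORM_ACTION + "?" + query


def fetch(url, timeout=30):
    """Download the raw bytes of a page (hypothetical helper, needs network)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    print(build_form_url("http://example.com"))
```

For a form with method="post", the same encoded string would instead be passed as bytes in the data argument of urlopen.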

PS: watch out for the onsubmit handler.
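Here is a minimal sketch of what that onsubmit handler amounts to: instead of submitting to the action URL, it sends the browser to http://web.archive.org/web/*/ followed by whatever was typed into the text box, so the same URL can be built and fetched directly. The helper names build_wayback_url and fetch_wayback_listing are hypothetical, written for Python 3, and it is an assumption that the site still serves this URL pattern as plain HTML.

```python
from urllib.request import urlopen

# Prefix used by the form's onsubmit JavaScript.
WAYBACK_PREFIX = "http://web.archive.org/web/*/"


def build_wayback_url(page_url):
    """Mirror the onsubmit code: prefix + the raw value of the text box."""
    return WAYBACK_PREFIX + page_url


def fetch_wayback_listing(page_url, timeout=30):
    """Fetch the result page as text (needs network access)."""
    with urlopen(build_wayback_url(page_url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(build_wayback_url("http://finance.yahoo.com/q/ae?s=MSFT"))
```

The returned HTML can then be handed to whatever parser the rest of the scraper already uses.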


6 Comments

What do you mean by "watch out"?
@rodling The onsubmit handler intercepts the submit and redirects the browser to a different URL from the one in action, which is where the form would go by default.
@rodling So just fetch that URL. Read the code in onsubmit to see how the URL is constructed, then build it the same way in Python.
I understand that I have to use that form; I'm just unclear on how to use that block. I have no problem constructing what goes into it, I just have no idea how to submit the form. I'm fairly new to this.
params = urllib.urlencode({'id':"wwmform", 'name': "wwmform", 'method': "get", 'action':"web.archive.org/form-submit.jsp", 'onsubmit':"document.location.href='web.archive.org/web*/finance.yahoo.com/q/ae?s=MSFT'.value",'style': "display:inline;"}) I also tried POST as the method, but I get the initial page rather than the one it should send me to.