I'm using a shell script to get the tracking information for a FedEx package. I pass the tracking number (a dummy number I found on the internet) as the first argument and fetch the page with curl:

#$1=797843158299
curl -A Mozilla/5.0 -b cookies -s "https://www.fedex.com/fedextrack/WTRK/index.html?action=track&action=track&action=track&tracknumbers=$1=1490" > log.txt

The curl command outputs the page's HTML, and the information I need is inside this element:

<!--TRACKING CONTENT MAIN-->
<div id="container" class="tracking_main_container"></div>

The delivery information I need to parse out is inside that div.
I am fairly new to scripting, and have tried some "| sed" suggestions I found online, but couldn't get anything to work.

  • I can see the html output of curl. What exactly should be the output/result of your script? Commented Dec 30, 2014 at 17:35
  • Probably the most robust approach is to use php's DOM parser. Though page scraping is always flaky. Commented Dec 30, 2014 at 17:36
  • Sorry? The tracking_main_container div is empty. Parsing its contents would give you an empty string. When the page is run in a browser, it's JavaScript that populates that div, and you're absolutely not going to be able to execute javascript from native bash without third-party tools. Commented Dec 30, 2014 at 17:41
  • ...now, if you want some suggestions re: such 3rd-party tools, I'd suggest using PhantomJS -- which will mean doing your scripting in JavaScript rather than bash. Commented Dec 30, 2014 at 17:43

1 Answer

This is not possible with curl or wget, because the final page is rendered with JavaScript. It is possible with tools that can execute JavaScript, such as spynner in Python, or PhantomJS.
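To see why curl alone is not enough, here is a minimal sketch (Python 3, standard library only) that reads the text inside the `container` div from the static HTML. The class and function names are made up for illustration; the sample input is the fragment quoted in the question.

```python
from html.parser import HTMLParser


class DivTextExtractor(HTMLParser):
    """Collect the text inside the first <div> whose id matches target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # nesting depth inside the target div; 0 means outside
        self.found = False  # True once the target div has been seen
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth > 0:
            self.depth += 1  # nested div inside the target
        elif not self.found and dict(attrs).get("id") == self.target_id:
            self.found = True
            self.depth = 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def div_text(html, target_id):
    """Return the stripped text of the target div, or None if it is absent."""
    parser = DivTextExtractor(target_id)
    parser.feed(html)
    parser.close()
    if not parser.found:
        return None
    return "".join(parser.chunks).strip()


# The fragment quoted in the question, as curl saves it.
static_html = (
    "<!--TRACKING CONTENT MAIN-->\n"
    '<div id="container" class="tracking_main_container"></div>\n'
)
print(repr(div_text(static_html, "container")))  # prints ''
```

The div exists but holds no text, so no amount of sed or HTML parsing on the curl output can recover the delivery information; the page has to be rendered first.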

The following spynner script is a full working example that checks whether the package has been delivered:

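#!/usr/bin/python
# Python 2 script: spynner drives a WebKit browser so the page's JavaScript runs.

import spynner
from lxml import etree

useragent = "Mozilla/5.0 (X11; Linux x86_64; rv:7.0.1) Gecko/20100101 Firefox/7.0.1"
url = "https://www.fedex.com/fedextrack/WTRK/index.html?action=track&action=track&action=track&tracknumbers=797843158299"

browser = spynner.Browser(user_agent=useragent)
browser.create_webview(False)
browser.load(url)
browser.wait_load()

# Parse the rendered DOM, not the raw response body that curl would return.
tree = etree.HTML(browser.html)

try:
    # Print the text of the first status div in the rendered page.
    print(tree.xpath('//div[@class="statusChevron_key_status bogus"]')[0].text)
except IndexError:
    # No such div was found, which this script treats as "not delivered".
    print("Undelivered")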

OUTPUT

Delivered
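If it helps to test the parsing step without launching a browser, the lookup can be separated from the rendering. The sketch below is an assumption-laden variant, not the script above: it uses Python 3's standard `html.parser` instead of lxml, `StatusParser` and `delivery_status` are hypothetical names, and the sample fragment is invented for illustration rather than captured from FedEx. Only the class string comes from the XPath above.

```python
from html.parser import HTMLParser

# Class attribute taken from the XPath expression in the script above.
STATUS_CLASS = "statusChevron_key_status bogus"


class StatusParser(HTMLParser):
    """Capture the leading text of the first div with the status class."""

    def __init__(self):
        super().__init__()
        self.matched = False    # True once the status div has been seen
        self.capturing = False  # True while reading that div's leading text
        self.status = ""

    def handle_starttag(self, tag, attrs):
        if self.capturing:
            # A child element started; stop, as lxml's .text would.
            self.capturing = False
        elif (not self.matched and tag == "div"
                and dict(attrs).get("class") == STATUS_CLASS):
            self.matched = True
            self.capturing = True

    def handle_endtag(self, tag):
        self.capturing = False

    def handle_data(self, data):
        if self.capturing:
            self.status += data


def delivery_status(rendered_html):
    """Return the status text, or "Undelivered" if no status text is found."""
    parser = StatusParser()
    parser.feed(rendered_html)
    parser.close()
    return parser.status.strip() or "Undelivered"


# Invented fragment standing in for the rendered page (an assumption).
sample = '<div class="statusChevron_key_status bogus">Delivered</div>'
print(delivery_status(sample))
```

With this split, `delivery_status` can be fed `browser.html` from any JavaScript-capable renderer, and the fallback to "Undelivered" mirrors the `except` branch of the script above.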
3 Comments

If you're going to make an answer with the same information I already provided in a comment, you might as well try to make it a reasonably informative answer -- providing specific suggestions of non-bash tools that could provide JavaScript execution, f'rinstance. I've already suggested PhantomJS on that count.
I just edited the answer to add spynner at the same time as your comment.
Bravo! Much more useful now. (I've been trying to do the same with Phantom, but was hitting a TypeError in Fedex's TrackingRootView minified js).
