I am fetching JSON data from a URL as a string and then writing it to a text file. This is my current Python script (I'm using Python 2.7.6):

import datetime  # needed for the timestamp written to data.txt
import json
import urllib
import time

startTime = time.time()

url = "http://someurl..."
success = False

while (True):
    try:
        txt = urllib.urlopen(url).read()
        print "        -> open URL time: %.3f" % (time.time() - startTime)
        secondTime = time.time()

        textFile = open('data.txt', 'w')
        textFile.write("JSON Data (")
        textFile.write(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
        textFile.write("):\n")
        textFile.write(txt)
        textFile.close()
        print "        -> write file time: %.3f" % (time.time() - secondTime)
        thirdTime = time.time()

        success = True
        break
    except ValueError as valueErr:
        print "Error:", valueErr
    except IOError as ioError:
        print "Error: Internet connection issues."
        break

if (success):
    print "    -> data.txt created()."
    print "    -> Finished."
    print "        -> Total Elapsed Time = %.3f" % (time.time() - startTime), "seconds."
else:
    print "    -> Finished."

and the output is as follows (I am running it in Windows command prompt, not the Python prompt):

'getCryptsyData.py' executing...
        -> open URL time: 4.864
    -> data.txt created().
        -> write file time: 0.005
    -> Finished.
        -> Total Elapsed Time = 4.939 seconds.

My question: is there a faster way of doing this, e.g. with a different Python script, another scripting language, or C?

Edit 1: updated code and output to current script I am running.

  • Have you considered ditching json.loads() and json.dumps() and just writing jsonURL.read() to the file? Commented Jan 29, 2014 at 16:54
  • How about putting in more elapsedTime statements? If you put one after the URL call, and one after the len(), you can see where the time is being spent. It might turn out to be all spent on waiting for the external URL. Commented Jan 29, 2014 at 17:49
  • @AndrewEhrlich URL open takes 4.864 seconds, writeFile takes 0.005. Is there any way to speed this up without changing anything relating to internet connection? Commented Jan 29, 2014 at 18:01
  • Try time curl http://someurl... > data.txt from a shell prompt. If that takes the same time, which it probably will, then the problem is your network and no code change will help. That command will also show a progress meter including things like average data transfer rate, which you can compare to the theoretical capacity of your LAN. Commented Jan 29, 2014 at 18:16
  • (If that prints "curl: command not found" or equivalent, get it from here: curl.haxx.se. Windows binaries are available if you scroll all the way to the bottom of their downloads page; I'd try the MSVC builds first, as they are least likely to require you to install more stuff to make 'em work.) Commented Jan 29, 2014 at 18:17

2 Answers

Since the majority of the time is spent waiting for the external server to respond, you probably can't gain anything by changing your code. Depending on how this code is going to be used, you might be able to improve the overall experience by:

  • If the same files are likely to be requested again with no changes, cache them locally.
  • If the files are available on another server, find a mirror that is closer to you.
  • If the files are predictable in size, you could have another process that copies them locally on an interval.
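
The first suggestion, caching locally, could look roughly like the sketch below. It is only an illustration: fetch_cached, cache_path, max_age_seconds, and the injectable fetch parameter are hypothetical names, not anything from the question, and the 60-second default is an arbitrary assumption. The import fallback is there so the same code runs on Python 3 as well as the Python 2 used in the question.

```python
import os
import time

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib import urlopen  # Python 2, as in the question


def fetch_cached(url, cache_path, max_age_seconds=60, fetch=None):
    """Return the response body for url, reusing cache_path if it is fresh.

    fetch is a hypothetical hook (defaults to a plain urlopen call) so the
    slow network step can be swapped out or tested in isolation.
    """
    if fetch is None:
        def fetch(u):
            return urlopen(u).read()

    # Reuse the local copy when it exists and is younger than max_age_seconds.
    if os.path.exists(cache_path):
        age = time.time() - os.path.getmtime(cache_path)
        if age < max_age_seconds:
            with open(cache_path, 'rb') as f:
                return f.read()

    # Otherwise pay the network cost once and refresh the local copy.
    body = fetch(url)
    with open(cache_path, 'wb') as f:
        f.write(body)
    return body
```

This only helps if slightly stale data is acceptable; the first request still takes as long as the server needs to respond.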

1 Comment

Thanks for your answer and comments. "This answer is useful" (i.e. upvoted)

You are loading JSON from txt already. Why not skip that and just write the response txt to file?

Your example could skip the json load/dump and basically be rewritten as:

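import urllib

# Fetch the raw response body and write it out unchanged (url as defined in the question).
txt = urllib.urlopen(url).read()
with open('data.txt', 'w') as f:
    f.write(txt)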

Some style tips:

  • Use a context manager ("with" statement) for writing to file.
  • For timing code blocks, check out the timeit module.
  • Follow PEP 8. Your camelCased var names hurt my eyes :)
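
To make the first two tips concrete, here is a small sketch that combines them. The helper names timed and save_text, the timings dict, and the placeholder JSON string are all made up for illustration; the only library pieces relied on are contextlib.contextmanager and timeit.default_timer. It writes to a temporary directory rather than to the data.txt from the question.

```python
import contextlib
import os
import tempfile
import timeit


@contextlib.contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = timeit.default_timer()
    try:
        yield
    finally:
        results[label] = timeit.default_timer() - start


def save_text(path, txt):
    # The with statement closes the file even if write() raises.
    with open(path, 'w') as f:
        f.write(txt)


# Hypothetical usage: time only the file-writing step.
out_path = os.path.join(tempfile.mkdtemp(), 'data.txt')
timings = {}
with timed('write file', timings):
    save_text(out_path, '{"placeholder": true}')
print("    -> write file time: %.3f" % timings['write file'])
```

Wrapping the urlopen call in the same timed block would give the "open URL time" figure from the question without the manual time.time() bookkeeping.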
