In Python, I run a term through several string-processing functions in a program. The user enters a term in a form and the term is passed through each function in turn. These include stemming, stop word removal, punctuation removal, spell checking and getting synonyms.

Stemming is done using the stemming package,

stop word and punctuation removal using string.replace() and regular expressions,

spell checking using pyEnchant,

getting synonyms using the Big Huge Thesaurus API.

The term is sent to an API. The results are returned and put through a hard-coded sorting process. After all that, the results are output to the user. The whole process takes over 10 seconds, which is too long. I'm wondering whether the fact that I am using many extensions, and therefore importing them, is causing the long delays.

I hope this isn't against the Stack Overflow rules, but I'm new to Python and this is the kind of thing that I need to know.

2 Answers

I'm wondering whether the fact that I am using many extensions, and therefore importing them, is causing the long delays.

Very unlikely. If you just import once and then call the functions in a loop, the loop should take most of the time. (Or are you firing up a Python process per word/sentence?)

As a rule of thumb, computer programs tend to spend 90% of their time executing 10% of the code. That part is worth optimizing. Things like import statements are usually not. To find out where your program is spending its time, use a profiler.
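As a rough illustration of what using a profiler looks like, here is a minimal sketch with the standard-library cProfile and pstats modules. The functions slow_step, fast_step and process are hypothetical stand-ins, not the asker's code; the slow step just sleeps to imitate something like a remote API call.

```python
import cProfile
import io
import pstats
import time


def slow_step():
    # Hypothetical stand-in for a slow stage such as a remote API call.
    time.sleep(0.05)


def fast_step(term):
    # Hypothetical stand-in for cheap local work such as punctuation removal.
    return term.strip(".,!?").lower()


def process(term):
    # Hypothetical pipeline: one cheap step followed by one slow step.
    cleaned = fast_step(term)
    slow_step()
    return cleaned


def profile_report(func, *args):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


result, report = profile_report(process, "Hello!")
print(report)
```

In the printed table, the rows with the largest cumulative time are the ones worth looking at first. For a whole script, running it as python -m cProfile -s cumulative followed by the script name gives a similar report without changing the code.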

1 Comment

Brilliant, I didn't know about those profilers. I think the slowest part is the call to the Big Huge Thesaurus, but I would be very surprised if that was the overall problem. In my program the user only enters one word/phrase and it is processed. I think my code may be insufficiently structured, though.

Time how long each of the individual checks takes. Then compare the results to see what is actually taking the most time.

import time

start = time.time()
# ... run the individual piece you want to time here ...
end = time.time()  # taken after the individual piece has completed

print(end - start, "seconds")

It'd be interesting to actually know how long each component of the string processing is taking.
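To compare all the components side by side, the same idea can be wrapped in a small helper. This is only a sketch: timed, remove_punctuation and fetch_synonyms are hypothetical names, the synonym lookup sleeps instead of calling a real API, and it uses time.perf_counter() rather than time.time() because it is intended for measuring short intervals.

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timed(label):
    """Record the wall-clock time of the enclosed block under label."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


def remove_punctuation(term):
    # Hypothetical stand-in for the punctuation-removal step.
    return "".join(ch for ch in term if ch.isalnum() or ch.isspace())


def fetch_synonyms(term):
    # Hypothetical stand-in for the thesaurus request; it sleeps
    # instead of making a real network call.
    time.sleep(0.05)
    return [term]


term = "Hello, world!"
with timed("punctuation"):
    cleaned = remove_punctuation(term)
with timed("synonyms"):
    synonyms = fetch_synonyms(cleaned)

# Print the slowest step first.
for label, seconds in sorted(timings.items(), key=lambda item: item[1], reverse=True):
    print(f"{label}: {seconds:.4f} seconds")
```

Wrapping each real step (stemming, spell checking, the API call, the sorting) in its own with timed(...) block should show quickly which one accounts for most of the 10 seconds.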
