edited Apr 28, 2020 at 14:45

Minor perf improvements

These are unlikely to impact your performance in a material way, but they are performance improvements nonetheless:

re.search(r'[,!?{}\[\]\"\"\'\']',word_tokens[j])

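looks up (and, on a cache miss, recompiles) the regex on every call. The re module caches recently compiled patterns, so this is usually a cache lookup rather than a full recompile, but calling re.compile() once outside of your loops avoids even that overhead.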
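As a minimal sketch of that suggestion, the pattern below is compiled once at module level and reused. The names `PUNCTUATION` and `has_punctuation` are hypothetical and not from the original code; the character class is copied from the line quoted above.

```python
import re

# Compiled once, outside any loop; the pattern is the one quoted above.
PUNCTUATION = re.compile(r'[,!?{}\[\]\"\"\'\']')


def has_punctuation(tokens):
    """Return one boolean per token: True if it contains a listed character."""
    # The compiled pattern object is reused for every token.
    return [PUNCTUATION.search(token) is not None for token in tokens]
```

Inside the original loop, this would amount to replacing the `re.search(...)` call with `PUNCTUATION.search(word_tokens[j])`.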

Repeated concatenation such as this:

wordtocompare = wordtocompare+" "+word_tokens[j].lower()

can be a problem; strings in Python are immutable, so every concatenation creates a new string instance. To avoid this, consider using io.StringIO or passing a generator to str.join().
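Both alternatives can be sketched roughly as follows. The function names are hypothetical, and the sketch assumes the goal is simply to join lowercased tokens with single spaces; unlike the concatenation above, neither version produces a leading space.

```python
import io


def join_lowered(tokens):
    """Build the result in one pass by joining a generator of lowercased tokens."""
    return " ".join(token.lower() for token in tokens)


def join_lowered_stringio(tokens):
    """Same result, accumulated in an in-memory text buffer instead."""
    buffer = io.StringIO()
    for index, token in enumerate(tokens):
        if index:
            # Separator goes before every token except the first.
            buffer.write(" ")
        buffer.write(token.lower())
    return buffer.getvalue()
```

The `join` form is usually the shorter and more idiomatic of the two; `StringIO` may suit cases where pieces are written from several places in the code.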

Other improvements

if not wordtocompare=="":

should be

if wordtocompare != "":

Also, the result of wordtocompare.strip() is not assigned to anything, so the call currently has no effect: strip() returns a new string rather than modifying the original in place.
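A small sketch of both points together, assuming the intent was to strip the string and then skip it if nothing is left. The `normalize` wrapper and its `None` return value are hypothetical, added only to make the example self-contained.

```python
def normalize(wordtocompare):
    """Strip surrounding whitespace; return None if nothing remains."""
    # str.strip() returns a new string, so the name must be rebound
    # for the stripped value to be kept.
    wordtocompare = wordtocompare.strip()
    if wordtocompare != "":
        return wordtocompare
    return None
```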

answered Apr 16, 2020 at 16:06

Reinderien
