I'm trying to count the number of times the word 'the' appears in two books saved as text files. The code I'm running returns zero for each book.

Here's my code:

def word_count(filename):
    """Count specified words in a text"""
    try:
        with open(filename) as f_obj:
            contents = f_obj.readlines()
            for line in contents:
                word_count = line.lower().count('the')
            print (word_count)

    except FileNotFoundError:
        msg = "Sorry, the file you entered, " + filename + ", could not be     found."
    print (msg)

dracula = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash   Course\\TEXT files\\dracula.txt'
siddhartha = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\siddhartha.txt'

word_count(dracula)
word_count(siddhartha)

What am I doing wrong here?

1
  • Nope. I tried incrementing using your line but I had to assign word_count before I increment it. So I added second line incrementing word_count with itself and it still gave me zero for both books. Commented Jul 31, 2016 at 2:08

4 Answers

3

You are re-assigning word_count on each iteration, so at the end it holds only the number of occurrences of 'the' in the last line of the file. You should be summing the counts instead. Another thing: should 'there' match? Probably not, so you probably want to use line.split() and count whole words. Also, you can iterate through a file object directly; there's no need for .readlines(). One last thing: you can use a generator expression to simplify. My first example is without the generator expression; the second is with it:

def word_count(filename):
    with open(filename) as f_obj:
        total = 0
        for line in f_obj:
            total += line.lower().split().count('the')
        print(total)

def word_count(filename):
    with open(filename) as f_obj:
        total = sum(line.lower().split().count('the') for line in f_obj)
        print(total)
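
To make the 'there' point concrete, here is a small sketch on a made-up sentence (the sample string is hypothetical, not taken from either book). It compares substring counting, whitespace splitting, and a regex word-boundary match. The regex variant is an extra option I'm adding for comparison, since split() leaves punctuation attached to tokens such as 'the?':

```python
import re

# Hypothetical sample text, lowercased once for all three approaches.
sample = "The cat sat there, then the dog left. Was it the?".lower()

# Substring count: also matches the 'the' inside 'there' and 'then'.
substring_hits = sample.count('the')

# Whitespace tokens: exact matches only, so 'the?' (with punctuation) is missed.
token_hits = sample.split().count('the')

# Word-boundary regex: whole words only, punctuation ignored.
regex_hits = len(re.findall(r"\bthe\b", sample))

print(substring_hits, token_hits, regex_hits)  # 5 2 3
```

Which number you want depends on what counts as "the word 'the'" for your purposes.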

1

Unless the word 'the' appears on the last line of each file, you'll see zeros.
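
A quick way to see this, as a minimal sketch that uses a hypothetical in-memory list of lines instead of a real file: plain assignment inside the loop overwrites the previous value, so only the last line's count survives.

```python
# Hypothetical stand-in for the lines of a file.
lines = [
    "The quick brown fox\n",
    "jumps over the lazy dog\n",
    "and runs away\n",
]

per_line = []
for line in lines:
    last_only = line.lower().count('the')  # overwritten on every pass
    per_line.append(last_only)

print(per_line)   # [1, 1, 0]
print(last_only)  # 0, the count for the final line only
```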

You likely want to initialize the word_count variable to zero, then use augmented assignment (+=) to add each line's count to it. For example:

def word_count(filename):
    """Count specified words in a text"""
    try:
        word_count = 0                                       # <- change #1 here
        with open(filename) as f_obj:
            contents = f_obj.readlines()
            for line in contents:
                word_count += line.lower().count('the')      # <- change #2 here
            print(word_count)

    except FileNotFoundError:
        msg = "Sorry, the file you entered, " + filename + ", could not be found."
        print(msg)                                           # <- indented so msg is only printed when it exists

dracula = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\dracula.txt'
siddhartha = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\siddhartha.txt'

word_count(dracula)
word_count(siddhartha)

Augmented assignment isn't necessary, just helpful. This line:

word_count += line.lower().count('the')

could be written as

word_count = word_count + line.lower().count('the')

But you also don't need to read the lines all into memory at once. You can iterate over the lines right from the file object. For example:

def word_count(filename):
    """Count specified words in a text"""
    try:
        word_count = 0
        with open(filename) as f_obj:
            for line in f_obj:                     # <- change here
                word_count += line.lower().count('the')
        print(word_count)

    except FileNotFoundError:
        msg = "Sorry, the file you entered, " + filename + ", could not be found."
        print(msg)

dracula = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\dracula.txt'
siddhartha = 'C:\\Users\\HP\\Desktop\\Programming\\Python\\Python Crash Course\\TEXT files\\siddhartha.txt'

word_count(dracula)
word_count(siddhartha)
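
If you'd rather get the number back instead of printing it, one possible variation is sketched below. The name count_word, the word parameter, the utf-8 encoding, and the whole-word regex are all my assumptions, not part of the original code. The demo writes a throwaway file so it runs without the book files:

```python
import re
import tempfile
from pathlib import Path


def count_word(filename, word):
    """Return how many times `word` occurs as a whole word (case-insensitive)."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    total = 0
    # encoding="utf-8" is an assumption; adjust it to match your files.
    with open(filename, encoding="utf-8") as f_obj:
        for line in f_obj:  # one line at a time, nothing loaded up front
            total += len(pattern.findall(line))
    return total


# Demo on a temporary file with hypothetical contents.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "sample.txt"
    sample_path.write_text("The end.\nThere is the other one.\n", encoding="utf-8")
    demo_total = count_word(sample_path, "the")
    print(demo_total)  # 2

    # The caller decides what to do about a missing file.
    try:
        count_word(Path(tmp_dir) / "missing.txt", "the")
        missing_handled = False
    except FileNotFoundError:
        missing_handled = True
        print("Sorry, that file could not be found.")
```

Returning the total keeps the counting separate from the reporting, so the same function can be reused for both books and for other words.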

1

Another way:

with open(filename) as f_obj:
    contents = f_obj.read()
    print("The word 'the' appears " + str(contents.lower().count('the')) + " times")

0

import os

def word_count(filename):
    """Count specified words in a text"""
    if os.path.exists(filename):
        if not os.path.isdir(filename):
            with open(filename) as f_obj:
                print(f_obj.read().lower().count('the'))  # count 'the', not the single letter 't'
        else:
            print("is path to folder, not to file '%s'" % filename)
    else:
        print("path not found '%s'" % filename)
