
I'm basically trying to code a simple spell-check program that will prompt you for an input file, then analyze the input file for possible spelling errors (by using binary search to see if the word is in the dictionary), before printing them in the output file. However, currently, it outputs everything in the input file instead of just the errors... My code is as follows:

import re

with open('DICTIONARY1.txt', 'r') as file:
    content = file.readlines()
    dictionary = []
    for line in content:
        line = line.rstrip()
        dictionary.append(line)

def binary_search(array, target, low, high):
    mid = (low + high) // 2
    if low > high:
        return -1
    elif array[mid] == target:
        return mid
    elif target < array[mid]:
        return binary_search(array, target, low, mid-1)
    else:
        return binary_search(array, target, mid+1, high)

input = input("Please enter file name of file to be analyzed: ")
infile = open(input, 'r')
contents = infile.readlines()
text = []
for line in contents:
    for word in line.split():
        word = re.sub('[^a-z\ \']+', " ", word.lower())
        text.append(word)
infile.close()
outfile = open('TYPO.txt', 'w')
for data in text:
    if data.strip() == '':
        pass
    elif binary_search(dictionary, data, 0, len(data)) == -1:
        outfile.write(data + "\n")
    else:
        pass

file.close
outfile.close

I can't seem to figure out what's wrong. :( Any help would be very much appreciated! Thank you. :)

  • Are you using the same code as given here? I'm getting a syntax error with this code, so I doubt you tried running it. Commented May 18, 2015 at 7:55
  • Hi! Yes, I'm using the same code and it works fine for me. What syntax error are you getting? :O Commented May 18, 2015 at 7:57
  • input = input("Pleas.. Commented May 18, 2015 at 8:03
  • @Pynchia you would want to use raw_input instead if you are on Python 2.x. Commented May 18, 2015 at 8:08
  • the prog works fine for me (Python 2.7). @Barum, thank you for the heads up on raw_input. Commented May 18, 2015 at 8:17

1 Answer

I tried replacing len(data) with len(dictionary), as that made more sense to me, and it seems to work in my very limited tests.

I think you were passing the length of the word being looked up as the upper bound of the search, rather than the size of the dictionary. So if you were looking up the word "dog", you were only checking the first few entries of the dictionary (indexes 0 through 3), and since your dictionary is probably very large, almost no word was ever found (so nearly every word ended up in the output file).
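To make the difference concrete, here is a minimal sketch of the same recursive search run against a tiny stand-in word list. The list contents and the is_misspelled helper are assumptions made up for illustration; they are not from the question. The sketch also passes len(dictionary) - 1 as the upper bound, on the assumption that high is meant to be the last valid index, which is how the function in the question treats it.

```python
def binary_search(array, target, low, high):
    """Return the index of target in sorted array[low..high], or -1 if absent."""
    if low > high:
        return -1
    mid = (low + high) // 2
    if array[mid] == target:
        return mid
    elif target < array[mid]:
        return binary_search(array, target, low, mid - 1)
    else:
        return binary_search(array, target, mid + 1, high)


# Hypothetical stand-in for DICTIONARY1.txt; binary search needs it sorted.
dictionary = sorted(
    ["apple", "banana", "cat", "dog", "elephant", "fish", "goat", "horse"]
)


def is_misspelled(word):
    # Hypothetical helper: search the whole list, up to the last valid index.
    return binary_search(dictionary, word, 0, len(dictionary) - 1) == -1


# Upper bound taken from the word's length, as in the question: only
# indexes 0..5 are searched, so "horse" (index 7) is never reached.
buggy = binary_search(dictionary, "horse", 0, len("horse"))

# Upper bound taken from the dictionary: the whole list is searched.
fixed = binary_search(dictionary, "horse", 0, len(dictionary) - 1)

print(buggy, fixed)
```

Using len(dictionary) as the bound, as I did in my tests, finds words that are present, but it can raise an IndexError for a word that sorts after every entry in the list, because mid can then reach len(dictionary). Subtracting 1 avoids that.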
