6

Total python beginner here.

I have the following function, which checks if a string derived from certain inputs exists in a text file. It loops through each line of the text file, to see if an exact match is found.

I have to break out of looping, immediately after the match is found, to avoid needless looping.

Here is the code:

def DateZoneCity_downloaded_previously(Order_Date,ZoneCity):    #   function to check if a given DateZoneCity
                                                                    #   combination had already been completely downloaded
    string_to_match = Order_Date.strftime('%Y/%m/%d') + "-" + ZoneCity[0] + "-" + ZoneCity[1]
    with open(Record_File) as download_status:
        DateZoneCity_exists = False
        for line in download_status:
            if string_to_match in line:
                DateZoneCity_exists = True                      #   if match found, then set "DateZoneCity_exists" to True
                break                                           #   and break out from the [for line in download_status:] loop
        if DateZoneCity_exists: return True
    download_status.close()

I am searching for a more concise, pythonic way to structure the code. Is there anything I can do to make this better? Somehow to eliminate the need for "DateZoneCity_exists" and the second If statement?

6
  • 1
    This belongs to code review. Commented Jun 10, 2016 at 4:20
  • I'm sorry, should I be doing something? Am I supposed to move the thread, I dunno how to though. Commented Jun 10, 2016 at 4:50
  • @Chinmay Here are a few posts you may want to consider: A guide to Code Review for Stack Overflow Users and Be careful when recommending Code Review to askers Commented Jun 10, 2016 at 4:52
  • 1
    What is the download_status.closed doing at the end? Commented Jun 10, 2016 at 6:43
  • 1
    Your online example is wrong. download_status will be automatically closed when leaving the with block — that's the point of using the with block. Commented Jun 10, 2016 at 12:47

5 Answers 5

6

This feels like a situation where any would be the best solution:

# Function to check if a given DateZoneCity
def DateZoneCity_downloaded_previously(Order_Date, ZoneCity):
    # Combination had already been completely downloaded
    string_to_match = Order_Date.strftime('%Y/%m/%d') + "-" + ZoneCity[0]
                                                      + "-" + ZoneCity[1]
    with open(Record_File) as download_status:
        return any((string_to_match in line) for line in download_status)

Note that in this case it will return False on negative rather than your current implementation that will return None, also note that it does break out of the looping immediately upon finding a positive result so it does not require looping through the entire file either way.

Sign up to request clarification or add additional context in comments.

1 Comment

Nice. Very concise, Much appreciated.
4

Just return instead of break:

def DateZoneCity_downloaded_previously(Order_Date,ZoneCity):
    """Check if a given DataZoneCity combination had already been downloaded."""
    string_to_match = Order_Date.strftime('%Y/%m/%d') + "-" + ZoneCity[0] + "-" + ZoneCity[1]
    with open(Record_File) as download_status:
        for line in download_status:
            if string_to_match in line:
                return True
    return False  # No match found.

1 Comment

Oh, so as I understand, the return statement will also immediately do the job of breaking out, avoiding future loops, am I right?
3

Depending on how big your text file is, you can read it into a string, and just use that (easier and often faster than reading and checking line per line):

if string_to_match in open(Record_File).read():
    return True

In your example:

def DateZoneCity_downloaded_previously(Order_Date,ZoneCity):                   
    string_to_match = Order_Date.strftime('%Y/%m/%d') + "-" + ZoneCity[0] + "-" + ZoneCity[1]
    if string_to_match in open(Record_File).read():
        return True

Comments

2

You don't have to put a for loop here, See the updated code.

def DateZoneCity_downloaded_previously(Order_Date,ZoneCity):
    """Check if a given DataZoneCity combination had already been downloaded."""
    string_to_match = Order_Date.strftime('%Y/%m/%d') + "-" + ZoneCity[0] + "-" + ZoneCity[1]
    with open(Record_File) as download_status:
        if string_to_match in download_status.read():
            return True
    return False  # No match found.

2 Comments

I was concerned that without the for loop, the whole file would be loaded and it would take a lot of RAM. No?
Yeah, but if its a small file. there is no need to add a for loop.
0

If record_file is small then you can use in directly with ifstatement like:-

def DateZoneCity_downloaded_previously(Order_Date,ZoneCity):                   
    string_to_match = Order_Date.strftime('%Y/%m/%d') + "-" + ZoneCity[0] + "-" + ZoneCity[1]
    if string_to_match in open(Record_File).read():
        return True  

while if record_file is large then you have to iterate through each line for finding a match like:-

def DateZoneCity_downloaded_previously(Order_Date, ZoneCity):
    string_to_match = Order_Date.strftime('%Y/%m/%d') + "-" + ZoneCity[0]  + "-" + ZoneCity[1]
    with open(Record_File) as download_status:
        return is_empty((string_to_match in line) for line in download_status)  

because reading line by line will save storage memory used for saving line.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.