Getting an array iteration error, wondering how to fix it

Question

I'm building a web scraper for real estate data in Python, and I've run into an error I can't fix.

for i in range(len(s)):
    if '$' in s[i]:
        price.append(s[i])

    elif 'bath' in s[i]:
        left = s[i].partition(",")[0]
        right = s[i].partition(",")[2]
        bed_bath.append(left)
        sqft_lot.append(right)

    elif 'fort collins' in s[i].lower():
        address0 = s[i-1]+' '+s[i]
        address.append(address0)

    elif s[i].lower() == 'advertisement':
        del s[i]

    else:
        continue

Value of 's' being:

display = Display(visible=0, size=(800, 600))
display.start()
browser = webdriver.Firefox()
browser.get(realtor.format(format))
p = browser.find_element(By.XPATH, "//ul[@class='jsx-343105667 property-list list-unstyle']")
content = p.text
s = re.split('\n', content)

The loop is supposed to iterate through the list s and sort each entry into one of four separate lists (price, bed_bath, sqft_lot, address) to be used in a DataFrame. I know the indexing itself works: printing each line with for i in range(len(s)): print(s[i]) runs fine. It only breaks once I add the logic above.

Getting error:

if '$' in s[i]:
IndexError: list index out of range

Any input into why this is happening would be much appreciated.

You seem to be removing elements with del s[i]. Surely this affects the length of s and might mean that you run i off the end.
– quamrana, Commented Feb 19, 2022 at 17:35
Did you mean to collect the offending indexes and remove them once this loop has finished?
– quamrana, Commented Feb 19, 2022 at 17:39
Added the code declaring the 's' variable. Let me take a look but I believe @quamrana got it. What I might do instead is use a separate for loop to take care of 'advertisement' entries.
– Samuel Troyer, Commented Feb 19, 2022 at 17:39
Ideally you would add a clear example of s as a Python list, and not the code that generates it, as we can't run that code.
– Yuval.R, Commented Feb 19, 2022 at 17:42

Yuval.R · Accepted Answer · 2022-02-19 17:39:43Z

2

As @quamrana mentioned, the problem is most likely the del s[i]: s gets shorter, so some of the indexes produced by range(len(s)) no longer exist in s. I have two possible fixes. Fix 1:

for i in range(len(s)):
    if i >= len(s):  # check if index is still in bounds
        break

    if '$' in s[i]:
        price.append(s[i])

    elif 'bath' in s[i]:
        left = s[i].partition(",")[0]
        right = s[i].partition(",")[2]
        bed_bath.append(left)
        sqft_lot.append(right)

    elif 'fort collins' in s[i].lower():
        address0 = s[i-1]+' '+s[i]
        address.append(address0)

    elif s[i].lower() == 'advertisement':
        del s[i]
    else:
        continue
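
A caveat on Fix 1, offered as a hedged aside: the bounds check prevents the IndexError, but after del s[i] the next element slides into position i and the for loop then moves on to i + 1, so the entry directly after a deleted one is never examined. A minimal sketch of one way around that is a while loop that only advances when nothing was deleted. The function name and the sample data are made up for illustration.

```python
def remove_ads_in_place(lines):
    """Delete 'advertisement' entries from lines in place and return the list."""
    i = 0
    while i < len(lines):  # len(lines) is re-checked on every pass
        if lines[i].lower() == 'advertisement':
            del lines[i]   # the next item slides into position i, so do not advance
        else:
            i += 1
    return lines


# Hypothetical sample with two adjacent advertisement rows
sample = ['$350,000', 'Advertisement', 'advertisement', '3 bed 2 bath, 1500 sqft']
print(remove_ads_in_place(sample))  # ['$350,000', '3 bed 2 bath, 1500 sqft']
```

With a plain for loop plus del, the second of the two adjacent advertisement rows would have been skipped.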

Fix 2:

indexes_to_remove = []

for i in range(len(s)):
    if '$' in s[i]:
        price.append(s[i])

    elif 'bath' in s[i]:
        left = s[i].partition(",")[0]
        right = s[i].partition(",")[2]
        bed_bath.append(left)
        sqft_lot.append(right)

    elif 'fort collins' in s[i].lower():
        address0 = s[i-1]+' '+s[i]
        address.append(address0)

    elif s[i].lower() == 'advertisement':
        indexes_to_remove.append(i)
    else:
        continue


# Delete from the highest index down, so earlier deletions don't shift the remaining targets.
for index in indexes_to_remove[::-1]:
    del s[index]
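
To see why the collected indexes should be deleted from the back, here is a small sketch comparing both directions. The helper name delete_indexes and the sample rows are hypothetical, not part of the original code.

```python
def delete_indexes(items, indexes):
    """Delete the given positions from items in place and return items."""
    # Highest index first, so earlier deletions never shift later targets.
    for index in sorted(indexes, reverse=True):
        del items[index]
    return items


rows = ['a', 'advertisement', 'b', 'advertisement', 'c']
print(delete_indexes(list(rows), [1, 3]))  # ['a', 'b', 'c']

# Deleting the same positions front to back removes the wrong item:
wrong = list(rows)
for index in [1, 3]:
    del wrong[index]
print(wrong)  # ['a', 'b', 'advertisement']
```

In the front-to-back version, the first deletion shifts everything after it one place to the left, so position 3 no longer holds the second advertisement.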


Samuel Troyer Over a year ago

Second answer here would be the way I would do it if it were necessary. Appreciate the input.

Roman · Accepted Answer · 2022-02-19 17:42:39Z

0

Instead of deleting from s while looping over it, create a new list and populate it with the filtered and processed data:

output_list = []

def process_data(value):
    # your code for processing data
    ...

for i in range(len(s)):
    # some_condition is a placeholder for your own test of which entries to keep
    if some_condition(s[i]):
        output_list.append(process_data(s[i]))
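
Applied to the question's data, this approach could look roughly like the sketch below. It is an illustration under assumptions, not the asker's actual code: the function name parse_listing_lines, the dictionary layout, the i > 0 guard, and the sample lines are all made up, and the real page text may be laid out differently.

```python
def parse_listing_lines(lines, city='fort collins'):
    """Sort scraped text lines into price, bed/bath, lot and address lists."""
    # Build a new list without the advertisement rows; lines is never mutated.
    cleaned = [line for line in lines if line.lower() != 'advertisement']

    fields = {'price': [], 'bed_bath': [], 'sqft_lot': [], 'address': []}
    for i, line in enumerate(cleaned):
        if '$' in line:
            fields['price'].append(line)
        elif 'bath' in line:
            left, _, right = line.partition(',')
            fields['bed_bath'].append(left)
            fields['sqft_lot'].append(right.strip())
        elif city in line.lower() and i > 0:
            # The street line is assumed to sit directly before the city line.
            fields['address'].append(cleaned[i - 1] + ' ' + line)
    return fields


# Hypothetical sample, with an advertisement between street and city lines
sample = [
    '$350,000',
    '3 bed 2 bath, 1500 sqft',
    '123 Main St',
    'Advertisement',
    'Fort Collins, CO 80521',
]
print(parse_listing_lines(sample)['address'])  # ['123 Main St Fort Collins, CO 80521']
```

Filtering first also means the lookup of the previous line cannot land on an advertisement row.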


Christian Weiss · Accepted Answer · 2022-02-19 17:49:03Z

0

You're deleting inside the for loop. The example here throws an error as well and is maybe easier to understand:

s = [i for i in range(5)]

for i in range(len(s)):
    print(f"{i=} with {s=}")
    del s[i]

Output:

i=0 with s=[0, 1, 2, 3, 4]
i=1 with s=[1, 2, 3, 4]
i=2 with s=[1, 3, 4]
i=3 with s=[1, 3]
IndexError: list assignment index out of range
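
The underlying reason is that range(len(s)) reads the length once, before the loop starts, and never notices that s has shrunk. A short sketch of that, plus one common way to sidestep it by looping over a copy (the even-number filter is only an illustrative condition):

```python
s = list(range(5))
indexes = range(len(s))  # len(s) is read once, here: range(0, 5)

del s[0]
print(len(indexes), len(s))  # 5 4

# Iterating over a snapshot of the values avoids the index bookkeeping.
s = list(range(5))
for value in list(s):  # list(s) is a copy, so s can shrink safely
    if value % 2 == 0:
        s.remove(value)
print(s)  # [1, 3]
```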

