I have some Python code that retrieves data from an internal (intranet) site. It appears to work, in that it retrieves the expected elements, but the results come back with no spaces or newlines between them. For example: item1item2item3item4.

To remedy this, I thought I would use this approach (code snip):

# XPATH redacted
elements = driver.find_elements(By.XPATH, "/html/body/app-root/....")

elelist = ""  # accumulated text of all elements
count = 0     # number of elements visited
for e in elements:
    elelist += e.text + "\n"
    count += 1
print(elelist)
print(f"There are {count} items.")

This results in the text items being returned like this:

item1

item2

item3

item4

If I append a comma instead of a newline, I get double commas: item1,,item2,,item3,,item4

I believe the following is related: the value of "count" is double the actual number of items, regardless of whether I append a newline or a comma. If there are 4 items, the count says there are 8. Using "elecount = elements.count" returns "<built-in method count of list object at 0x0000015B49B89100>".

Python is fairly new to me, as is digging into a site's elements. Is there something I am not accounting for that does not show up when I inspect the page elements?

1 Answer

It looks like the list contains 'empty' elements, possibly styling or layout divs that have no text. That would explain both the doubled separators and the doubled count. You could add a condition inside the loop to skip them:

if len(e.text) > 0:
    elelist += e.text+"\n"
    count += 1  # count only the elements that have text

or filter the list beforehand:

elements = [e for e in elements if len(e.text) > 0]
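
As for "elements.count": on a list, count is a method that counts occurrences of a given value, so writing it without parentheses just gives you the method object, which is the "<built-in method count ...>" text you saw. Use len(elements) to get the number of elements. Here is a rough sketch that puts the filtering, the joining, and the counting together. It does not touch Selenium at all: FakeElement is a hypothetical stand-in that only models the .text attribute, and the alternating empty entries are my assumption about what your XPath is matching.

```python
from dataclasses import dataclass


@dataclass
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement; only .text is modeled."""
    text: str


# Assumed shape of the result: each real item is followed by an empty element.
raw = ["item1", "", "item2", "", "item3", "", "item4", ""]
elements = [FakeElement(t) for t in raw]

# Keep only non-blank text; strip() also discards whitespace-only strings.
texts = [e.text.strip() for e in elements if e.text.strip()]

# str.join puts the separator between items only, so nothing trails or doubles.
print("\n".join(texts))
print(",".join(texts))

# len() gives the number of entries; list.count is a method, not a length.
print(f"Matched {len(elements)} elements, {len(texts)} of them have text.")
```

Building the list first and joining once at the end also avoids the trailing newline or comma that the += approach leaves after the last item.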