csvWriter using Python Selenium is not iterating through 0 to j - it is only taking the last hit and not each hit

Question

I am having trouble capturing all of the XPath hits. I tell the script to take every element from 0 to j (j = 20, the length of the container), where each element has an XPath hit for //*[@id='tabs-1']/div[3]/table/tbody/tr[2]/td and for //*[@id='tabs-1']/div[3]/table/tbody/tr[1]/td[3]. However, when it cycles through j, only the very last hit seems to be written to the csv file. Is this a problem with the way the csvWriter is coded? I want to write every hit to its own row in the csv file, with each row holding the hits for both path queries (spread across 2 columns), one row per j.

Also, how would I code it so that the csv adds to the already existing rows when it cycles to the next page (for i in range(0, num_page)) and repeats the process? Thanks for your help!

import sys
import csv
from selenium import webdriver
import time
import pandas as pd
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC 


 
# default path to file to store data
path_to_file = "/Users/D/Desktop/reviews.csv"

# default number of scraped pages
num_page = 3

# default tripadvisor website of hotel or things to do (attraction/monument) 
url = "https://www.tripadvisor.com/Attraction_Review-g187791-d192285-Reviews-Colosseum-Rome_Lazio.html"

# if you pass the inputs in the command line
if (len(sys.argv) == 4):
    path_to_file = sys.argv[1]
    num_page = int(sys.argv[2])
    url = sys.argv[3]

# import the webdriver
driver = webdriver.Safari()
driver.get(url)

# open the file to save the review
csvFile = open(path_to_file, 'a', encoding="utf-8")
csvWriter = csv.writer(csvFile)

# change the value inside the range to save more or less reviews

for i in range(0, num_page):
    name = []
    start=[]
    # expand the review
    time.sleep(2)
    container = driver.find_elements_by_xpath("//*[@id='tabs-1']/div[3]/table/tbody")
    
    for j in range(len(container)):
        name = container[j].find_element_by_xpath(".//tr[2]/td").text
        start = container[j].find_element_by_xpath(".//tr[1]/td[3]").text
        
# name of csv file  
        filename = path_to_file
    
# writing to csv file  
        with open(filename, 'w') as csvfile:  
    # creating a csv writer object  
            csvwriter = csv.writer(csvfile)   
    # writing the data rows  
            csvwriter.writerow([name, start])

        driver.find_element_by_xpath("//*[@id='tabs-1']/div[2]/a[@accesskey='n']").click()
     

driver.quit()

Lesmana · Accepted Answer · 2020-12-27 00:02:09Z

in each iteration you are overwriting the old contents of the file. that is why only the last iteration survives.

this line

with open(filename, 'w') as csvfile:

opens the file and truncates (removes) the content

to append use 'a' instead of 'w' as the mode.

see https://docs.python.org/3/library/functions.html#open
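
a minimal sketch of the difference between the two modes, assuming made-up sample rows and a temporary directory instead of the real reviews.csv (the helper name write_each_row is hypothetical). it reopens the file for every row, the way the inner loop in the question does, once with 'w' and once with 'a'.

```python
import csv
import os
import tempfile

# Made-up sample data standing in for the scraped (name, start) pairs.
rows = [["Alice", "09:00"], ["Bob", "10:30"], ["Carol", "12:15"]]


def write_each_row(path, mode):
    """Reopen the file for every row, as the question's inner loop does."""
    for row in rows:
        with open(path, mode, newline="", encoding="utf-8") as f:
            csv.writer(f).writerow(row)
    # Read the file back so the surviving rows can be inspected.
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.reader(f))


tmp_dir = tempfile.mkdtemp()
w_result = write_each_row(os.path.join(tmp_dir, "w_mode.csv"), "w")
a_result = write_each_row(os.path.join(tmp_dir, "a_mode.csv"), "a")

print(w_result)  # 'w' truncates on every open, so only the last row is left
print(a_result)  # 'a' keeps what is already there, so all rows are left
```

note that 'a' also keeps rows from earlier runs of the script, which may or may not be what you want.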

or for better performance open the file once outside of the loop.

with open(filename, 'w') as csvfile:  
    csvwriter = csv.writer(csvfile)   
    for j in range(len(container)):
        ...
        csvwriter.writerow([name, start])

this might not matter much because selenium is likely far slower than multiple opens. but it is always nice for your system if you use open sparingly.
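
a sketch of the "open once" idea extended to several pages, which also covers the second part of the question. the function scrape_page is a hypothetical stand-in for the selenium lookups and returns made-up values, and the output goes to a temporary file. the assumption here is that the file is opened before the page loop, so rows from every page end up in the same file.

```python
import csv
import os
import tempfile


def scrape_page(page_index):
    """Hypothetical stand-in for the Selenium lookups on one page."""
    return [(f"name-{page_index}-{j}", f"start-{page_index}-{j}") for j in range(3)]


def save_all_pages(path, num_page):
    # Open the file once, before both loops. newline="" is what the csv
    # module documentation recommends for files passed to csv.writer.
    with open(path, "w", newline="", encoding="utf-8") as csvfile:
        csvwriter = csv.writer(csvfile)
        for i in range(num_page):
            for name, start in scrape_page(i):
                csvwriter.writerow([name, start])
            # In the real script the click on the "next" link would go here,
            # once per page, after all rows of the page are written.


out_path = os.path.join(tempfile.mkdtemp(), "reviews.csv")
save_all_pages(out_path, num_page=3)

with open(out_path, newline="", encoding="utf-8") as f:
    saved = list(csv.reader(f))
print(len(saved))  # 3 pages with 3 rows each
```

in the question the click on the next link appears to be indented inside the inner loop, so it would run once per row instead of once per page. use 'a' instead of 'w' in the open call if rows from previous runs of the script should be kept as well.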

edited Dec 27, 2020 at 0:02

answered Dec 26, 2020 at 23:53


3 Comments

user7875084 Over a year ago

Thanks for your answer - what is a multiple open? How can I make it faster?

Lesmana Over a year ago

with multiple open i mean that you repeatedly open (and close) the same file to add a line. open is expensive. you should open once and then write all the lines then close. consider open like opening your garage to take a tool out. you do not open and close for every single tool. instead you open the garage once and keep open until you are done working and only then close the garage.

user7875084 Over a year ago

Thanks for your help. I changed it based on your advice, but am still having some additional issues. I've posted a new issue here if you have some time - thanks in advance: stackoverflow.com/questions/65461944/…
