I would like to scrape some information from a website, but running searches continuously lets the site detect that the process is automated. I want to add a time delay so that the script runs two searches, waits ten minutes, then runs the next two, and so on, but I do not know how to use a time function for this. My code looks like this:

import pandas as pd

def get_keyword():
    data = pd.read_excel('a.xlsx')
    # search() is defined elsewhere (not shown)
    for d in data['name']:
        search(d)

if __name__ == "__main__":
    print('in progress')
    get_keyword()

Many thanks!!!

  • implicitly_wait only waits for an element to appear in the HTML; it does not pause your code. You have to use time.sleep() for that. Commented Apr 30, 2021 at 5:22
  • If you want to wait 10 minutes, then maybe you should use an external program (a scheduler) to execute the script every 10 minutes. On Linux you can use cron for this. Commented Apr 30, 2021 at 5:23
  • BTW, you can call browser.implicitly_wait(10) once and it will remember this value. Commented Apr 30, 2021 at 5:24
  • @furas, thanks for your help. implicitly_wait is used to wait for the website to respond; time.sleep does not work well when I put it after the get_keyword func. Commented Apr 30, 2021 at 6:31
  • You have to use it after search(d), inside the for-loop, or inside search(). That creates a pause after every search. If you want to pause after every two searches, you will have to use a variable to count loops and, when it reaches 2, run sleep and reset the count. Commented Apr 30, 2021 at 6:45

2 Answers

If you want to pause after every search, call time.sleep right after search(d), inside the loop:

import time

import pandas as pd

def get_keyword():
    data = pd.read_excel('a.xlsx')
    for d in data['name']:
        search(d)
        time.sleep(10*60)  # 10 min * 60 s

If you want to pause after every two searches, you have to count loop iterations. When the count reaches 2, call time.sleep and reset the counter.

import time

import pandas as pd

def get_keyword():
    data = pd.read_excel('a.xlsx')

    count_loops = 0

    for d in data['name']:
        search(d)

        count_loops += 1
        if count_loops == 2:
            time.sleep(10*60)  # 10 min * 60 s
            count_loops = 0
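
The counting logic above can also be written as a small helper that walks through the keywords in fixed-size batches. This is only a sketch: `run_in_batches`, its parameters, and the injectable `sleep` argument are hypothetical names that do not appear in the original code, and `action` stands in for the asker's `search` function. Unlike the counter version, it skips the pause once the last batch is done.

```python
import time


def run_in_batches(items, action, batch_size=2, pause_seconds=10 * 60, sleep=time.sleep):
    """Call action(item) for every item, pausing between batches.

    All names are illustrative; `action` stands in for search().
    """
    items = list(items)
    for start in range(0, len(items), batch_size):
        for item in items[start:start + batch_size]:
            action(item)
        # Pause only if another batch is still waiting.
        if start + batch_size < len(items):
            sleep(pause_seconds)
```

With the code from the question, the call would look like run_in_batches(data['name'], search). Passing a different `sleep` function makes it possible to try the logic without actually waiting ten minutes.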

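If it has to behave more like a human, use a random pause length:

import random
import time

import pandas as pd

def get_keyword():
    data = pd.read_excel('a.xlsx')

    count_loops = 0

    for d in data['name']:
        search(d)

        count_loops += 1
        if count_loops == 2:
            minutes = random.randint(8, 12)  # 8-12 minutes
            time.sleep(minutes*60)
            count_loops = 0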

If you count down to 0 instead of counting up to 2, you can even use a random number of searches per batch:

import random
import time

import pandas as pd

def get_keyword():
    data = pd.read_excel('a.xlsx')

    count_loops = random.randint(1, 4)  # 1-4 searches before the first pause

    for d in data['name']:
        search(d)

        count_loops -= 1

        if count_loops == 0:
            minutes = random.randint(8, 12)  # 8-12 minutes
            time.sleep(minutes*60)  # minutes * 60 s

            count_loops = random.randint(1, 4)
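
The same countdown idea can be sketched as a standalone function that is easier to try out. Everything here is an assumption made for illustration: the name `run_with_random_pauses`, the `rng` and `sleep` parameters, and the use of `random.uniform` so that the pause is not always a whole number of minutes. `action` stands in for `search`.

```python
import random
import time


def run_with_random_pauses(items, action, rng=None, sleep=time.sleep):
    """Call action(item) for each item, pausing 8-12 minutes after every 1-4 items."""
    if rng is None:
        rng = random.Random()
    remaining = rng.randint(1, 4)  # searches left before the next pause
    for item in items:
        action(item)
        remaining -= 1
        if remaining == 0:
            # uniform() allows fractional minutes, unlike randint().
            sleep(rng.uniform(8, 12) * 60)
            remaining = rng.randint(1, 4)
```

Passing a seeded random.Random as `rng` makes a run repeatable, and passing a stand-in for `sleep` lets you check the pattern of pauses without waiting.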

1 Comment

got it, really helps a lot! thank you so much for your help : )!

The sleep function is really easy to use. First import the time module, then call time.sleep:

import time
time.sleep(5)

The code above suspends the program for 5 seconds. Put the time.sleep call at the point where you want the program to pause; you can change the number of seconds as needed.
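
Where the call goes matters. The sketch below contrasts a sleep placed after the loop with a sleep placed inside it; the function names and the `search` and `sleep` parameters are hypothetical and exist only to make the difference visible.

```python
import time


def search_all_then_pause(keywords, search, sleep=time.sleep):
    # sleep() sits after the loop: it runs once, when every search is already done.
    for keyword in keywords:
        search(keyword)
    sleep(5)


def pause_after_each_search(keywords, search, sleep=time.sleep):
    # sleep() sits inside the loop: it runs after every single search.
    for keyword in keywords:
        search(keyword)
        sleep(5)
```

In the first version the searches still run back to back, which is what happens when time.sleep is placed after the call to get_keyword().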

1 Comment

thanks for your help! I put it after the get_keyword func, but it still runs continuously..