I have a problem with web scraping. I need to scrape data from a betting site and store it in a DataFrame.
My code:
import numpy as np
import pandas as pd
from urllib.parse import urljoin
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import time
DRIVER_PATH = 'C:\\executables\\chromedriver.exe'
options = Options()
options.add_argument("--headless")  # the options.headless attribute is deprecated in newer Selenium releases
options.add_argument("--window-size=1920,1200")
driver = webdriver.Chrome(options=options, executable_path=DRIVER_PATH)
driver.get("https://www.nike.sk/live-stavky/futbal")
time.sleep(10)
soup = BeautifulSoup(driver.page_source, 'html.parser')
# match time
out_1 = soup.find_all(class_='ellipsis flex fs-10 c-black-50 justify-between pr-5')
# home and away teams
out_2 = soup.find_all(class_='ellipsis f-condensed c-black-100 text-extra-bold match-opponents pr-10')
# match status
out_3 = soup.find_all(class_='flex justify-center text-right flex-col match-score-col fs-12 c-orange text-extra-bold')
# match status 2
out_4 = soup.find_all(class_='flex justify-center text-right flex-col match-score-col fs-12 text-extra-bold c-default-light')
My output (out_1, ..., out_4) is messy blocks of text. How can I combine it into a single DataFrame? Can I convert it to a DataFrame without regex?
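
To make the goal concrete, here is a rough sketch of the kind of regex-free result I am after. It is not tested against the live page: the HTML string is a made-up stand-in for driver.page_source, and the class names match-row and match-time are hypothetical placeholders (only match-opponents and match-score-col come from my selectors above). The idea is to loop over one container per match and read each field with get_text(), so the columns stay aligned instead of coming from four separate find_all lists.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical stand-in for driver.page_source; the real markup will differ.
html = """
<div class="match-row">
  <div class="match-time">45'</div>
  <div class="match-opponents">Team A - Team B</div>
  <div class="match-score-col">1:0</div>
</div>
<div class="match-row">
  <div class="match-time">12'</div>
  <div class="match-opponents">Team C - Team D</div>
  <div class="match-score-col">0:0</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def text_of(parent, class_name):
    """Return the stripped text of the first descendant with this class, or None."""
    tag = parent.find(class_=class_name)
    return tag.get_text(" ", strip=True) if tag else None


rows = []
# "match-row" is an assumed per-match container class, not taken from the site.
for match in soup.find_all(class_="match-row"):
    opponents = text_of(match, "match-opponents") or ""
    # Plain string splitting instead of regex; assumes a " - " separator.
    home, _, away = opponents.partition(" - ")
    rows.append(
        {
            "time": text_of(match, "match-time"),
            "home": home.strip() or None,
            "away": away.strip() or None,
            "score": text_of(match, "match-score-col"),
        }
    )

df = pd.DataFrame(rows, columns=["time", "home", "away", "score"])
print(df)
```

If a field is missing for some match, text_of returns None, so that row still appears in the DataFrame with an empty cell rather than shifting the other columns.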