I am trying to scrape the weather report for a particular region using BeautifulSoup4 in Python. Here's my code:

from bs4 import BeautifulSoup
import requests


url = 'https://www.accuweather.com/en/in/guwahati/186893/weather-forecast/186893'
agent = {"User-Agent":'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'}
page = requests.get(url, headers=agent)
soup = BeautifulSoup(page.content, 'lxml')  # parsed document (a BeautifulSoup object)
# print(soup.prettify())


# alldata is a ResultSet (a list-like collection) of Tag objects with class 'day-panel'
alldata = soup.find_all(class_='day-panel')


# This gives us all the required data; we just need to arrange it nicely
datas = []
for h in alldata:
    datas.append(h.text.strip())

print(datas)
print(datas[0])

The first print statement outputs:

['Current Weather\n\t\n\n\t\t11:55 PM\n\t\n\n\n\n\t\t\t22°\n\t\t\n\n\t\t\t\tC\n\t\t\t\n\n\n\t\tRealFeel®\n\t\t20°\n\t\n\n\t\tPartly cloudy', 'Today\n\t\n\n\t\t3/31\n\t\n\n\n\n\t\t\t34°\n\t\t\n\n\t\t\t\tHi\n\t\t\t\n\n\n\t\tRealFeel®\n\t\t36°\n\t\n\n\t\tVery warm with hazy sunshine', 'Tonight\n\t\n\n\t\t3/31\n\t\n\n\n\n\t\t\t16°\n\t\t\n\n\t\t\t\tLo\n\t\t\t\n\n\n\t\tRealFeel®\n\t\t16°\n\t\n\n\t\tPatchy clouds', 'Tomorrow\n\t\n\n\t\t4/1\n\t\n\n\n\n\t\t\t36°\n\t\t\n\n\t\t\t\t/ 16°\n\t\t\t\n\n\n\t\tRealFeel®\n\t\t\n\t\n\n\t\tHot with hazy sunshine']

I want only the text, not a list.
The second print statement outputs:

Current Weather


        11:56 PM




            22°


                C



        RealFeel®
        20°


        Mostly clear

Expected output:

'Current Weather\n\t\n\n\t\t11:55 PM\n\t\n\n\n\n\t\t\t22°\n\t\t\n\n\t\t\t\tC\n\t\t\t\n\n\n\t\tRealFeel®\n\t\t20°\n\t\n\n\t\tPartly cloudy'

How should I solve this issue?

1 Answer

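The output looks like that because print() writes the string's actual characters, so each \n in the data becomes a line break and each \t becomes a tab. To see the escape sequences instead, pass the string through Python's built-in repr function, which returns the string as a quoted literal with those characters escaped.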

Like this:

print(repr(datas[0]))

The output:

'Current Weather\n\t\n\n\t\t12:28 AM\n\t\n\n\n\n\t\t\t71°\n\t\t\n\n\t\t\t\tF\n\t\t\t\n\n\n\t\tRealFeel®\n\t\t69°\n\t\n\n\t\tMostly clear'
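
To see the difference without hitting the site, here is a minimal sketch using a made-up string shaped like the scraped text. The variable names `sample`, `escaped`, and `compact` are only for illustration. The last step, collapsing whitespace with `split()` and `join()`, is an optional alternative that assumes you would rather have clean single-line text than the escaped form.

```python
# Hypothetical sample shaped like the scraped text; it is not fetched from the site.
sample = 'Current Weather\n\t\n\n\t\t11:55 PM\n\t\n\n\t\tPartly cloudy'

# print() writes the string's characters as they are,
# so \n and \t show up as real line breaks and tabs.
print(sample)

# repr() returns the string as a quoted Python literal
# with the control characters escaped.
escaped = repr(sample)
print(escaped)

# Optional alternative: collapse every run of whitespace into one space.
compact = ' '.join(sample.split())
print(compact)
```

Note that repr() only changes how the string is displayed; the data in `datas` stays the same.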