
I am trying to fetch the keywords (tags) of an article from a news website.

This is the link: `https://www.horizont.net/marketing/nachrichten/bgh-haendler-haftet-nicht-fuer-kundenbewertungen-auf-amazon-180980`

I am using this to fetch the keywords:

   Article_Keyword = bs.find('div', {'class':'ListTags'}).get_text()

This is what I am getting:

Themen Bundesgerichtshof Amazon Verband Sozialer Wettbewerb Kundenbewertung Tape dpa 

I need the keywords separated by commas. I could split the string with a regular expression, but some keywords consist of more than one word, and each of those must stay together as a single keyword.

Is there a way to get the keywords as a comma-separated string?

  • You should look up the a elements under your Article_Keyword. Not sure this works: Article_Keyword.find_all("a") Commented Feb 20, 2020 at 9:36
  • I guess that will work too, but I need a separator between them, like a comma. Commented Feb 20, 2020 at 9:38
  • It will be a list; you can use ",".join() to get the list elements as a single string separated by commas. Commented Feb 20, 2020 at 9:41
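
The approach suggested in the comments above can be sketched as follows. This is only a minimal illustration: `SAMPLE_HTML` is hypothetical markup modelled on the output shown in the question, and the real page's HTML may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the question's output;
# the real page may be structured differently.
SAMPLE_HTML = """
<div class="ListTags">
  <span>Themen</span>
  <a href="#">Bundesgerichtshof</a>
  <a href="#">Amazon</a>
  <a href="#">Verband Sozialer Wettbewerb</a>
</div>
"""

bs = BeautifulSoup(SAMPLE_HTML, "html.parser")
container = bs.find("div", {"class": "ListTags"})

# Each <a> element holds exactly one keyword, so multi-word keywords
# are never broken apart and the "Themen" label is skipped.
keywords = [a.get_text(strip=True) for a in container.find_all("a")]
joined = ",".join(keywords)

print(keywords)
print(joined)
```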

3 Answers


I used the class of the child a elements to pick out each keyword separately. I hope the code below helps.

from bs4 import BeautifulSoup as soup
from requests import get

url = "https://www.horizont.net/marketing/nachrichten/bgh-haendler-haftet-nicht-fuer-kundenbewertungen-auf-amazon-180980"
clnt = get(url)
page = soup(clnt.text, "html.parser")
# Locate the container that holds the keyword links
data = page.find('div', attrs={'class': 'ListTags'})
# Take the text of each keyword link; multi-word keywords stay intact
data1 = [ele.text for ele in data.find_all('a', attrs={'class': 'PageArticle_keyword'})]
print(data1)
print(",".join(data1))

Output:

>> ['Bundesgerichtshof', 'Amazon', 'Verband Sozialer Wettbewerb', 'Kundenbewertung', 'Tape', 'dpa']
>> Bundesgerichtshof,Amazon,Verband Sozialer Wettbewerb,Kundenbewertung,Tape,dpa

Please accept the answer if it is useful.


Try this:

Article_Keyword = bs.find('div', {'class':'ListTags'})
aes_Article_Keyword = Article_Keyword.find_all("a")

s_Article_Keyword = ", ".join([x.text for x in aes_Article_Keyword])
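
One thing the snippet above does not cover is that `find` returns `None` when no matching element exists, in which case `find_all` would raise an `AttributeError`. A minimal sketch of a guarded version follows; `join_keywords` is a hypothetical helper name, not part of BeautifulSoup, and the HTML strings in the usage lines are made up for illustration.

```python
from bs4 import BeautifulSoup


def join_keywords(html, separator=", "):
    """Return the keyword links inside div.ListTags joined by `separator`.

    Hypothetical helper: returns an empty string when the container
    is missing instead of raising AttributeError.
    """
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", {"class": "ListTags"})
    if container is None:
        return ""
    return separator.join(a.get_text(strip=True) for a in container.find_all("a"))


# Usage with made-up markup
with_tags = join_keywords(
    '<div class="ListTags"><a>Amazon</a><a>Verband Sozialer Wettbewerb</a></div>'
)
without_tags = join_keywords("<p>no tags here</p>")
print(repr(with_tags))
print(repr(without_tags))
```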


Try this:

import requests
from bs4 import BeautifulSoup

url = 'https://www.horizont.net/marketing/nachrichten/bgh-haendler-haftet-nicht-fuer-kundenbewertungen-auf-amazon-180980'
page = requests.get(url)
soup1 = BeautifulSoup(page.content, "lxml")

Article_Keyword = soup1.find('div',{'class':'ListTags'}).find_all("a")
Article_Keyword = ", ".join([keyword.text.strip() for keyword in Article_Keyword])

print(Article_Keyword)

3 Comments

This will fail if a tag contains a space.
No, it will not fail; that case does not happen here. A blank tag will not appear in the string.
I mean that the tag "Hello World" would be split into "Hello" and "World", whereas the desired result is "one tag", "Hello World", "more tag".
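
The concern in this exchange can be checked on a small example. The sketch below uses made-up markup with the tag names from the comment. It suggests that splitting the flattened text on whitespace does break multi-word tags, while reading each a element separately, as the answer does, keeps them whole.

```python
from bs4 import BeautifulSoup

# Made-up markup using the tag names from the comment above
html = '<div class="ListTags"><a>one tag</a> <a>Hello World</a> <a>more tag</a></div>'
div = BeautifulSoup(html, "html.parser").find("div", {"class": "ListTags"})

# Flattening first and splitting on whitespace loses the tag boundaries
split_on_whitespace = div.get_text().split()

# Reading each <a> separately keeps every tag as one item
per_element = [a.get_text(strip=True) for a in div.find_all("a")]

print(split_on_whitespace)
print(", ".join(per_element))
```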
