Python
Nikita Ivanov, 2020-05-30 15:25:48

How to parse the target="_blank" attribute with BeautifulSoup in Python?

I am learning the BeautifulSoup library in Python and have been stuck for two days on parsing the target="_blank" attribute.

What I want to do:
Parse the winners from this page: https://randstuff.ru/vkwin/zrnzt6/
For each winner I need the link to their page (done) and their first and last name, stored in a dictionary.

This is what the element looks like; in my case there are 5 of them. I have extracted the link, and now I need the first and last name, which come right after target="_blank">

<a class="name" href="https://vk.com/stush---" target="_blank">Стю-- Серге---</a>


I have already searched the whole Internet and code in git repositories, but have not found a solution. Please tell me how to solve this :)

What I have so far:
import requests
from bs4 import BeautifulSoup

URL = 'https://randstuff.ru/vkwin/zrnzt6/'
HEADERS = {'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.96 YaBrowser/20.4.0.3443 Yowser/2.5 Safari/537.36', 'accept': '*/*'}

def get_html(url, params=None):
    r = requests.get(url, headers=HEADERS, params=params)
    return r

def get_content(html):
    soup = BeautifulSoup(html, 'html.parser')
    items = soup.find_all('a', class_='name')[1:]
    vk = []
    for item in items:
        vk.append({
            'title': item.get('href'),
            #'name': item.select('a', 'target="_blank">').get_text THE FIRST AND LAST NAME PARSING WAS SUPPOSED TO GO HERE
        })
    print(vk)

def parse():
    html = get_html(URL)
    if html.status_code == 200:
        get_content(html.text)
    else:
        print('Error')

parse()


1 answer(s)
Sergey Karbivnichy, 2020-05-30
@n1k_ivanov

Replace this line:

#'name': item.select('a', 'target="_blank">').get_text THE FIRST AND LAST NAME PARSING WAS SUPPOSED TO GO HERE

with this:
'name': item.text
Output:
[{'title': 'https://vk.com/stusha45', 'name': 'Стюша Сергеева'}, {'title': 'https://vk.com/id209266081', 'name': 'Юлия Сухова'}, {'title': 'https://vk.com/id394370251', 'name': 'Нина Ляшенко'}, {'title': 'https://vk.com/id473065083', 'name': 'Андрей Кротов'}, {'title': 'https://vk.com/id491175633', 'name': 'Тамара Петрова'}]
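
The reason item.text works is that the name is not part of the target attribute at all: it is the text between the opening <a ...> tag and the closing </a>. The sketch below is only an illustration of that distinction and does not fetch the real page. SAMPLE_HTML, its names and URLs, and the extract_winners helper are all made up for the example. It also shows one way to restrict the match to links that carry target="_blank", using a CSS attribute selector, which may be what the original select() call was aiming for.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup modeled on the anchor from the question;
# the names and URLs are placeholders, not data from the real page.
SAMPLE_HTML = """
<div>
  <a class="name" href="https://vk.com/example_one" target="_blank">Ivan Petrov</a>
  <a class="name" href="https://vk.com/example_two" target="_blank"> Anna Sidorova </a>
  <a class="name" href="https://vk.com/example_three">No Target</a>
</div>
"""


def extract_winners(html):
    """Return a list of {'title': href, 'name': link text} dictionaries."""
    soup = BeautifulSoup(html, 'html.parser')
    winners = []
    # CSS attribute selector: <a class="name"> tags that also have target="_blank".
    for link in soup.select('a.name[target="_blank"]'):
        winners.append({
            'title': link.get('href'),          # value of the href attribute
            'name': link.get_text(strip=True),  # text inside the tag, whitespace trimmed
        })
    return winners


winners = extract_winners(SAMPLE_HTML)
print(winners)

# For comparison: the attribute itself only holds the string "_blank".
first_link = BeautifulSoup(SAMPLE_HTML, 'html.parser').find('a', class_='name')
target_value = first_link.get('target')
print(target_value)
```

In the code from the question, the same idea would mean keeping 'title': item.get('href') and adding 'name': item.get_text(strip=True), which behaves like item.text but also trims surrounding whitespace.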
