What should I do when Selenium finds an element while parsing, but its text comes back as an empty string?

Danila Rumyantsev, 2021-12-26 11:37:07

Python

I am parsing the site https://www.svyaznoy.ru/catalog/accessories/8936/6...
When I try to parse availability, I get a list of empty strings. I have tried time.sleep(), and the XPath is correct. I don't know what else to do.

from selenium import webdriver
from selenium.webdriver.common.by import By
import time

class Parser():
    def __init__(self,url):
        options = webdriver.ChromeOptions()
        options.add_experimental_option("excludeSwitches", ["enable-automation"])
        options.add_experimental_option('useAutomationExtension', False)
        options.add_argument("--disable-blink-features=AutomationControlled")
        self.driver = webdriver.Chrome(options=options)
        self.url = url
    def parse(self):
        self.list_of_towns = ['moskva','sankt-peterburg','arhangelsk','vladivostok','volgograd','voronezh','ekaterinburg','izhevsk','irkutsk','kazan','kemerovo','krasnodar','krasnoyarsk','murmansk','naberezhnye-chelny',
                              'nizhniy-novgorod','novosibirsk','omsk','perm','rostov-na-donu','saratov','samara','sochi','surgut','tver','tolyatti','tula','tyumen','ulyanovsk','ufa','habarovsk','chelyabinsk','yaroslavl']
        self.cityes = []
        try:
            self.driver.get(self.url+'/availability')
            #self.button = self.driver.find_element(By.XPATH,'//a[@class="b-product-tabs-link _nowrap"]').click()
            self.driver.implicitly_wait(8)
            self.scroll = self.driver.find_element(By.XPATH, '//span[@class="shops-other-remains"]').click()
            html = self.driver.page_source

            self.location = self.driver.find_element(By.XPATH,'//div[@class="b-shops-map__address-text"]').text
            self.loc = self.location.split(',')
            self.loc = self.loc[0]
            self.location = self.driver.find_elements(By.XPATH, '//div[@class="b-shops-map__address-text"]')
            time.sleep(2)
            self.nalichie = self.driver.find_elements(By.XPATH,'//span[@class="b-tooltip-new s-tooltip _up _hover-mode"]')
            print([x.text for x in self.location])
            print([y.text for y in self.nalichie])
        finally:
            pass

p = Parser('https://www.svyaznoy.ru/catalog/accessories/8936/6270567/availability/moskva#mainContent')
p.parse()

2 answer(s)

Danila Rumyantsev, 2021-12-26
@Bubunduc

The .text property only returns visible text. Here the text appears only after hovering the cursor over the element, so it has to be retrieved with the method get_attribute('textContent'):

self.driver.get(self.url + '/availability')
self.shops = self.driver.find_elements(By.XPATH, '//span[@class="b-tooltip-new__text s-tooltip-text"]')
print([x.get_attribute('textContent') for x in self.shops])
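
The same idea can be wrapped in a small helper. This is only a sketch: hidden_text is a made-up name, and it assumes the object passed in behaves like a Selenium WebElement, that is, it has a get_attribute(name) method. It also collapses the whitespace that textContent usually carries over from the markup.

```python
def hidden_text(element):
    """Return an element's text even when the element is not rendered.

    `element` is assumed to behave like a Selenium WebElement,
    i.e. to expose get_attribute(name). The helper name is hypothetical.
    """
    # textContent holds the text of the node whether or not it is visible,
    # whereas .text only reports what is currently rendered.
    raw = element.get_attribute('textContent')
    # get_attribute may return None; also collapse newlines and indentation.
    return ' '.join((raw or '').split())
```

With it, the last line above could be written as print([hidden_text(x) for x in self.shops]).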

soremix, 2021-12-26
@SoreMix

The .text property only returns visible text. Here the text appears only after hovering the cursor over the element, so it has to be obtained with the method get_attribute('textContent'). Still, it seems to me that it is better to parse each block separately. Then every value stays with its own store and nothing gets mixed up. In general, though, all the addresses are already available in a JS variable on the page:

self.shops = self.driver.execute_script('return shopsObj')
for shop in self.shops:
    print(shop['address'], shop['availability_hint'])
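
If the variable is missing on some page, return shopsObj raises a JavaScript error. A slightly more defensive sketch is below. read_shops is a made-up name, and it assumes, as in the snippet above, that shopsObj is a list of objects with the keys address and availability_hint.

```python
def read_shops(driver):
    """Return (address, availability_hint) pairs from the page's shopsObj.

    Assumes shopsObj is a list of objects, as in the snippet above.
    Returns an empty list if the page does not define the variable.
    """
    # typeof is safe to use on an undeclared identifier, so this script
    # does not throw when shopsObj is absent.
    script = "return typeof shopsObj !== 'undefined' ? shopsObj : [];"
    shops = driver.execute_script(script)
    return [
        (shop.get('address'), shop.get('availability_hint'))
        for shop in shops or []
        if isinstance(shop, dict)  # skip anything that is not an object
    ]
```

It can be called as read_shops(self.driver) after the page has loaded.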

old

First, find all the rows that contain information about a store:

self.shops = self.driver.find_elements(By.XPATH, '//div[@class="b-shops-map__shop _other"]')

Then loop over them and pull the necessary information out of each one:

self.shops = self.driver.find_elements(By.XPATH, '//div[@class="b-shops-map__shop _other"]')

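for shop in self.shops:
    # The find_element_by_* helpers are deprecated in Selenium 4; use find_element(By..., ...)
    print(shop.find_element(By.CLASS_NAME, 'b-shops-map__address-text').text)
    # The hint is hidden until hover, so read textContent instead of .text
    hint = shop.find_element(By.CSS_SELECTOR, '.b-shops-map__availability-col-wrapper .b-tooltip-new__text.s-tooltip-text')
    print(hint.get_attribute('textContent').strip())
    print()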