Python
aqau123, 2021-12-07 15:12:10

How to implement validation?

year = soup.find('li', class_='CardInfoRow_year').find_all('span', class_ = 'CardInfoRow__cell')[1].find('a', class_ = 'Link').contents[0]

That is the code I have.
In theory something can go wrong: the element may not exist, and the code will raise an error. I don't want that, so I'd like to write a validation function. If everything is fine, it returns the text of the element; if not, it returns the string "-".
Here is how I picture this function:
def validate(soupFunction):
    # soupFunction: the soup call is passed in here, for example:
    # soup.find('li', class_='CardInfoRow_year').find_all('span', class_='CardInfoRow__cell')[1].find('a', class_='Link')
    try:
        # here we call it
        item = soupFunction()
        return item.contents[0]
    except ... as e:
        return '-'

And it would be called something like this:
year = validate(soup.find('li', class_='CardInfoRow_year').find_all('span', class_ = 'CardInfoRow__cell')[1].find('a', class_ = 'Link'))

How can this be implemented? I know there is a way to do it with lambda functions.


1 answer(s)
Vindicar, 2021-12-07
@Vindicar

Well, you can use lambdas, but does it make sense? What do you gain compared to a plain try/except?
In my opinion, it will only be less readable.
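For reference, a minimal sketch of what the lambda variant might look like. The call from the question cannot work as written, because the argument is evaluated before validate is even entered, so the error is raised outside the try. Wrapping the lookup in a lambda postpones it until the function calls it. The names getter and default are illustrative, and the two exception types are the ones I would expect this particular chain to raise.

```python
def validate(getter, default='-'):
    """Run a zero-argument lookup and return the first child of the tag it finds.

    getter and default are illustrative names, not part of BeautifulSoup.
    """
    try:
        # The lookup runs here, so a failure anywhere in the chain is caught.
        return getter().contents[0]
    except (AttributeError, IndexError):
        # AttributeError: a find() in the chain returned None.
        # IndexError: find_all() gave too few results, or contents is empty.
        return default


# Intended call with BeautifulSoup (assumes a parsed `soup` object exists):
# year = validate(lambda: soup.find('li', class_='CardInfoRow_year')
#                 .find_all('span', class_='CardInfoRow__cell')[1]
#                 .find('a', class_='Link'))
```

The body is just the plain try/except moved into a function, so the only thing gained is not repeating it for every field.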
Unfortunately, BeautifulSoup does not support XPath out of the box. XPath would let you write the selector from your example in one line, roughly like this:

//li[contains-token(@class, 'CardInfoRow_year')]//span[contains-token(@class, 'CardInfoRow__cell')][2]/a[contains-token(@class, 'Link')]

You can build a rough equivalent yourself:
class BSItem: # a single element
    def __init__(self, item):
        self._item = item
    
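    def __bool__(self):
        return self._item is not None
    
    def __str__(self):
        # text of the first child, or '' if the element is missing or empty
        return str(self._item.contents[0]) if self and self._item.contents else ''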
    
    @property
    def tag(self):
        return self._item
    
    def __truediv__(self, other): # the / operator finds a single element
        if not self: # an empty element stays empty
            return self
        if isinstance(other, str): # item / 'tag.classname'
            tag, _, cls = other.partition('.')
            if cls:
                return BSItem(self._item.find(tag, class_=cls))
            else:
                return BSItem(self._item.find(tag))
        if isinstance(other, int) and other == 0: # item / 0 == item
            return self
        if callable(other): # item / lambda tag: tag.has_attr('id')
            return BSItem(self._item.find(other))
        raise ValueError() # unsupported operand
    
    def __floordiv__(self, other): # the // operator finds all matching elements
        if not self:
            return BSItems([])
        if isinstance(other, str): # item // 'tag.classname'
            tag, _, cls = other.partition('.')
            if cls:
                return BSItems(self._item.find_all(tag, class_=cls))
            else:
                return BSItems(self._item.find_all(tag))
        if callable(other): # item // lambda tag: tag.has_attr('id')
            return BSItems(self._item.find_all(other))
        raise ValueError()


class BSItems: # a collection of elements
    def __init__(self, items):
        self._items = items
    
    def __bool__(self):
        return bool(self._items)
    
    def __iter__(self): # allows: for tag in items:
        return iter(self._items)
    
    def __len__(self):
        return len(self._items)
    
    def __truediv__(self, other): # the / operator finds one element inside each element
        if not self: # an empty collection stays empty
            return self
        if isinstance(other, str): # items / 'tag.classname'
            tag, _, cls = other.partition('.')
            if cls:
                found = [item.find(tag, class_=cls) for item in self._items]
            else:
                found = [item.find(tag) for item in self._items]
            return BSItems([f for f in found if f is not None]) # drop misses
        if isinstance(other, int): # items / 2 picks the third element of the collection
            return BSItem(self._items[other]) if 0 <= other < len(self._items) else BSItem(None)
        if callable(other): # items / lambda tag: tag.has_attr('id')
            found = [item.find(other) for item in self._items]
            return BSItems([f for f in found if f is not None]) # drop misses
        raise ValueError() # unsupported operand
    
    def __floordiv__(self, other): # the // operator finds all matching elements
        if not self:
            return self
        if isinstance(other, str): # item // 'tag.classname'
            tag, _, cls = other.partition('.')
            result = []
            for item in self._items:
                result.extend(item.find_all(tag, class_=cls) if cls else item.find_all(tag))
            return BSItems(result)
        if callable(other): # item // lambda tag: tag.has_attr('id')
            result = []
            for item in self._items:
                result.extend(item.find_all(other))
            return BSItems(result)
        raise ValueError()

I can't vouch that the code is free of errors, but it should convey the idea.
Usage would look something like this:
year_tag = BSItem(soup) / 'li.CardInfoRow_year' // 'span.CardInfoRow__cell' / 1 / 'a.Link'
year = str(year_tag) or '-'
