Python
Rahat, 2020-04-14 12:26:14

How to test a search engine parser?

This code was written for the site gadget.kg. I can't get the test
to accept the HTML and start checking it.

import unittest
from .transform import GadgetTransform

class TestExtractor(unittest.TestCase):

    def test_data_products(self):
        transform = GadgetTransform()
        product_page = 'http://www.gadget.kg/catalog/search?q=xiaomi+redmi+note+7'
        product_details = transform.get_data(html=product_page)
        expection_product_details = [
            {'Title': 'Мобильный Телефон Xiaomi Redmi Note 7 PRO (6+128Gb)', 'cost': '12.000 сом'},
            {'Title': 'Мобильный Телефон Xiaomi Redmi Note 7 (6+64Gb)', 'cost': '10.400 сом'},
            {'Title': 'Мобильный Телефон Xiaomi Redmi Note 7 (4+128Gb) EU', 'cost': '11.500 сом'},
            {'Title': 'Мобильный Телефон Xiaomi Redmi Note 7 (3+32Gb) EU', 'cost': '10.000 сом'},
            {'Title': 'Мобильный Телефон Xiaomi Redmi Note 7 (4+64Gb) EU', 'cost': '10.700 сом'}]

        self.assertEqual(expection_product_details, product_details)

The code that parses the site:
from bs4 import BeautifulSoup

class GadgetTransform:

    def get_data(self,  html: str) -> list:
        soup = BeautifulSoup(html, 'html.parser')
        items = soup.find_all('div', class_='hit__slide')
        phone = []
        for item in items:
            price = item.find('span', class_='hit__slide__price')
            if price:
                price = price.get_text()
            else:
                price = ''
            phone = [i for i in phone if i['cost'] != '']
            phone.append({
                'Title': item.find("h6", class_='hit__slide__title').get_text(),
                'cost': price,
            })

        print(phone)


2 answer(s)
Dimonchik, 2020-04-14
@ARHAT-99

Break the problem down, bro:
first check whether the HTML actually arrives;
if it does, check the CSS classes.
And don't worry.
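
One way to separate those two checks is to test the parser against a fixed HTML string, so the network is not involved at all. The sketch below is only an illustration under assumptions: it swaps BeautifulSoup for the standard-library `html.parser` to stay dependency-free, `SlideParser` and `SAMPLE_HTML` are hypothetical names, and the sample markup is guessed from the class names in the question (`hit__slide`, `hit__slide__title`, `hit__slide__price`), not copied from gadget.kg.

```python
import unittest
from html.parser import HTMLParser


class SlideParser(HTMLParser):
    """Hypothetical stand-in parser: collects title/price pairs per hit__slide block."""

    def __init__(self):
        super().__init__()
        self.products = []
        self._field = None  # key that the current text node belongs to, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and "hit__slide" in classes:
            # A new product block starts; price stays '' if no price tag follows.
            self.products.append({"Title": "", "cost": ""})
        elif tag == "h6" and "hit__slide__title" in classes:
            self._field = "Title"
        elif tag == "span" and "hit__slide__price" in classes:
            self._field = "cost"

    def handle_endtag(self, tag):
        if tag in ("h6", "span"):
            self._field = None

    def handle_data(self, data):
        if self._field and self.products:
            self.products[-1][self._field] += data.strip()


def get_data(html: str) -> list:
    """Takes HTML text (not a URL) and returns the parsed products."""
    parser = SlideParser()
    parser.feed(html)
    parser.close()
    return parser.products


# Assumed markup, shaped after the class names used in the question.
SAMPLE_HTML = """
<div class="hit__slide">
  <h6 class="hit__slide__title">Phone A</h6>
  <span class="hit__slide__price">100</span>
</div>
<div class="hit__slide">
  <h6 class="hit__slide__title">Phone B</h6>
</div>
"""


class TestGetData(unittest.TestCase):
    def test_fixture(self):
        expected = [
            {"Title": "Phone A", "cost": "100"},
            {"Title": "Phone B", "cost": ""},
        ]
        self.assertEqual(get_data(SAMPLE_HTML), expected)
```

If a test like this passes on a saved copy of the real page but fails on live data, the download step is the likely culprit; if it fails on the saved copy too, the CSS classes are the place to look. It can be run with `python -m unittest`.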

Rahat, 2020-04-14
@ARHAT-99

And what does this mean? Is the comparison wrong, or is there a mistake in the code?

[{'cost': '13.000 сом',
  'title': 'Мобильный Телефон Xiaomi Redmi Note 8 (6+128Gb) Global IND'},
 {'cost': '15.800 сом',
  'title': 'Мобильный Телефон Xiaomi Redmi Note 8 PRO (6+128Gb) Global IND'},
 {'cost': '10.300 сом',
  'title': 'Мобильный Телефон Xiaomi Redmi Note 8 (3+32Gb) EU'},
 {'cost': '12.900 сом',
  'title': 'Мобильный Телефон Xiaomi Redmi Note 8 (4+128Gb) EU'},
 {'cost': '15.900 сом',
  'title': 'Мобильный Телефон Xiaomi Redmi Note 8 PRO (6+128Gb) Global EU'},
 {'cost': '15.400 сом',
  'title': 'Мобильный Телефон Xiaomi Redmi Note 8 PRO (6+64Gb) EU'},
 {'cost': '11.700 сом',
  'title': 'Мобильный Телефон Xiaomi Redmi Note 8 (4+64Gb) EU'}] != None

<Click to see difference>

Traceback (most recent call last):
  File "/snap/pycharm-community/179/plugins/python-ce/helpers/pycharm/teamcity/diff_tools.py", line 32, in _patched_equals
    old(self, first, second, msg)
  File "/usr/lib/python3.6/unittest/case.py", line 829, in assertEqual
    assertion_func(first, second, msg=msg)
  File "/usr/lib/python3.6/unittest/case.py", line 822, in _baseAssertEqual
    raise self.failureException(msg)
AssertionError: None != [{'cost': '13.000 сом', 'title': 'Мобильн[595 chars]EU'}]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/usr/lib/python3.6/unittest/case.py", line 605, in run
    testMethod()
  File "/home/rahat/projects/products-aggregator/gadgetkg/transform_test.py", line 26, in test_data_products
    self.assertEqual(product_details, expection_product_details)
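
The `None` on one side of the comparison suggests that `get_data` hands nothing back to the test: the parser as posted ends with `print(phone)` and has no `return`, and a Python function without a `return` statement returns `None`. A minimal sketch of the difference, using hypothetical function names and made-up sample data:

```python
def collect_and_print(items):
    """Like the posted get_data: builds a list but only prints it."""
    phone = list(items)
    print(phone)  # printing is not returning; the caller receives None


def collect_and_return(items):
    """Same list, but handed back to the caller."""
    phone = list(items)
    return phone


# Made-up sample data, only for illustration.
sample = [{"Title": "Phone A", "cost": "100"}]

print(collect_and_print(sample) is None)       # the printed list goes nowhere useful
print(collect_and_return(sample) == sample)    # the caller can compare this result
```

Note also that the printed dictionaries use the key `'title'` while the expected list in the question uses `'Title'`, and the products differ (Redmi Note 8 versus Redmi Note 7), so the assertion would likely still fail after adding a `return` until those match.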
