How to parse data from this site?

Python

tofel, 2019-11-08 21:06:31

Hello everyone, please tell me how, and with what tools, I can parse this site.

import requests

url = "https://mobile.888.ru/sport/search?text=%D0%9B%D0%B8%D0%B2%D0%B5%D1%80%D0%BF%D1%83%D0%BB%D1%8C"

r = requests.get(url)

print(r.text)

Unfortunately, the response does not contain the full page, and the links are missing. I know about Selenium, but I would like to know whether there is another way to get the links to the football teams.

4 answer(s)

tofel, 2019-11-09
@tofel

Apparently you can't do without Selenium:

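from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.firefox.service import Service
import time

options = Options()
options.add_argument("-headless")  # run Firefox without a visible window

# Selenium 4 takes the geckodriver path via Service; older releases used executable_path=.
service = Service(executable_path=r"C:\geckodriver.exe")
driver = webdriver.Firefox(options=options, service=service)
try:
    driver.set_page_load_timeout(30)
    driver.get('https://mobile.888.ru/sport/search?text=%D0%9B%D0%B8%D0%B2%D0%B5%D1%80%D0%BF%D1%83%D0%BB%D1%8C')
    time.sleep(3)  # give the page's JavaScript time to render the results
    html = driver.page_source
    print(html)
finally:
    driver.quit()  # always close the browser, even if loading fails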

Dimonchik, 2019-11-08
@dimonchik2013

Look at the page's source code.
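
One way to act on this is to check whether the links are present in the raw HTML at all. Below is a minimal sketch using only the standard library; `LinkCollector` and `extract_links` are hypothetical names, and the sample markup is invented for illustration, not taken from the site:

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all anchor hrefs found in html, in document order."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Invented sample; in practice pass r.text from the question's request.
sample = '<ul><li><a href="/sport/team/1">Team A</a></li><li><a>no href</a></li></ul>'
print(extract_links(sample))  # prints ['/sport/team/1']
```

If the list comes back empty for the real response, the links are most likely added later by JavaScript; in that case the browser's Network tab may show the request that actually returns the data.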

FeNUMe, 2019-11-08
@FeNUMe

In requests, send headers similar to those sent by your browser, since the browser receives the results in full.
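
A minimal sketch of this suggestion, assuming the site varies its response based on request headers (not confirmed here; if the links are added by JavaScript, headers alone will not help). The header values are illustrative, and `make_session` and `fetch` are hypothetical helper names:

```python
import requests

# Illustrative browser-like headers (assumed values); copy the real ones from
# your browser's developer tools (Network tab) for the request that works.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "ru-RU,ru;q=0.9,en;q=0.8",
}


def make_session():
    """Return a requests.Session that sends the headers above on every request."""
    session = requests.Session()
    session.headers.update(BROWSER_HEADERS)
    return session


def fetch(url, timeout=10):
    """Download url with browser-like headers and return the body as text."""
    response = make_session().get(url, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of returning an error page
    return response.text
```

Calling `fetch(url)` with the search URL from the question returns the HTML as the server sends it for these headers, which can then be compared with what the browser shows.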

vlsnake, 2019-11-09
@vlsnake

The page is built with JavaScript; without Selenium and a browser engine (PhantomJS at a minimum, ideally chrome --headless) it cannot be parsed.
For FeNUMe, a request with a browser User-Agent:

# -*- coding: utf-8 -*-
import sys
if sys.hexversion < 0x03000000:
    from urllib2 import Request, urlopen
else:
    from urllib.request import Request, urlopen


useragent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36"
url = 'https://mobile.888.ru/sport/search?text=%D0%9B%D0%B8%D0%B2%D0%B5%D1%80%D0%BF%D1%83%D0%BB%D1%8C'

# On Python 3, urlopen() ignores URLopener.version, so pass the header explicitly.
request = Request(url, headers={"User-Agent": useragent})
text = urlopen(request)
print(text.read())