What should I do about this problem when parsing a site?

Maxim Vasilenko, 2019-10-14 15:41:17

Python

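from random import choice

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By


def read_lines(path):
    # Return the non-empty lines of a text file.
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def main():
    url = '--url--'
    useragents = read_lines('useragents.txt')
    proxies = read_lines('proxies')

    opts = Options()
    # Pick a random user agent for this run (the original always used useragents[0]).
    opts.add_argument('user-agent=' + choice(useragents))
    opts.add_argument('--proxy-server=http://' + proxies[0])

    # Selenium 4 style: the driver path is passed via Service, the options via `options`.
    service = Service(executable_path=r'--path to chromedriver--')
    driver = webdriver.Chrome(service=service, options=opts)
    try:
        driver.get(url)
        # .text is a property, not a method, so it is read without parentheses.
        print(driver.find_element(By.CLASS_NAME, 'ip').text)
    finally:
        driver.quit()


if __name__ == '__main__':
    main()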

I use Selenium to drive a browser and look for free dates on the embassy website. After a few minutes of running,
the site stops loading and returns an error (Unable to access the site). The User-Agent changes between runs and the proxy is a paid one. How does the site determine that a bot is at work?

1 answer(s)

Dimonchik, 2019-10-14
@vasil3nk

By the proxy's IP address, for example.
Or it does not detect a bot at all and simply blocks any IP that makes requests too frequently.
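
To make the second possibility concrete, here is a minimal sketch of a per-IP rate limiter of the kind a site might run. It is purely illustrative: how the embassy site actually works is unknown, and the class name `IpRateLimiter`, the limit of 3 requests per 60 seconds, and the IP addresses are all made up for the example. The point is that the User-Agent never enters the decision, so changing it does not help while every request leaves from the same proxy IP.

```python
from collections import defaultdict, deque


class IpRateLimiter:
    """Hypothetical per-IP sliding-window limiter (illustration only)."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def allow(self, ip, now):
        """Return True if a request from `ip` at time `now` (seconds) is allowed."""
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # too many recent requests from this IP
        hits.append(now)
        return True


limiter = IpRateLimiter(max_requests=3, window_seconds=60)
# Four requests from the same proxy IP within one minute: the fourth is refused,
# no matter which User-Agent each request carried.
results = [limiter.allow('203.0.113.7', now=t) for t in (0, 10, 20, 30)]
print(results)  # [True, True, True, False]
```

Under a scheme like this, the script gets through for a while and is then cut off, which matches the symptom of working for a few minutes before the site becomes unreachable.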