Python
jerwright, 2021-07-21 09:44:30

Why can't I use selenium with Heroku?

I recently ran into a problem. I wrote a parser using selenium and bs4, and the bot needs to click the "Load more" button through selenium. The code works fine on my computer, but on Heroku it produces errors.

Link to the page: https://vstup.osvita.ua/y2021/r5/111/848151/

Errors:
1) This warning appears when the app starts on Heroku:
2021-07-21T06:24:27.787246+00:00 app[worker.1]: /app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/firefox/firefox_profile.py:208: SyntaxWarning: "is" with a literal. Did you mean "=="?
2021-07-21T06:24:27.787261+00:00 app[worker.1]: if setting is None or setting is '':

2) This error appears after running the command.
2021-07-21T06:31:42.821687+00:00 app[worker.1]: Message: no such element: Unable to locate element: {"method":"xpath","selector":"/html/body/ div[7]/div/div/div[3]/span"}
2021-07-21T06:31:42.821695+00:00 app[worker.1]: (Session info: headless chrome=91.0.4472.164)

Code:

def do_abitcheck(message, fio, URL=None):
    chrome_options = webdriver.ChromeOptions()
    chrome_options.binary_location = os.environ.get("GOOGLE_CHROME_BIN")
    chrome_options.add_argument("--headless")
    chrome_options.add_argument("--disable-dev-shm-usage")
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--proxy-server=138.128.91.65:8000")
    driver = webdriver.Chrome(executable_path=os.environ.get("CHROMEDRIVER_PATH"), options=chrome_options)

    try:
        driver.get(url=URL)
        time.sleep(2)
        driver.refresh()
        time.sleep(3)
        more_button = driver.find_element_by_xpath('/html/body/div[7]/div/div/div[3]/span')
        more_button.click()
        time.sleep(1)
    except UnexpectedAlertPresentException as ex:
        return bot.send_message(message.chat.id, "An error occurred while connecting to the site because of repeated attempts by the bot. Please wait and try again.")
    except Exception as ex:
        print(ex)
        return bot.send_message(message.chat.id, 'An error occurred while connecting to the site. The page may not have been found. Please try again and check the link.')
    finally:
        time.sleep(4)
        needed_html_code = driver.page_source
        driver.close()
        driver.quit()


The errors are caused by a failed attempt to click the button, but they do not occur on my local computer.

The needed_html_code variable holds the page's HTML after the button click has made the additional columns appear on the site.

What can I change here? Alternatively, is there a way to make the bot click the button through beautifulsoup4 or requests and copy the columns that appear?


2 answer(s)
Sergey Gornostaev, 2021-07-21
@jerwright

First, don't use an identity check ("is") where you need an equality check ("=="). Second, when a selector fails, log the content of the page. Most likely the site's developers were no fools, so it has anti-scraping protection that returns a stub page for requests coming from Heroku.
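
A minimal sketch of both points, under some assumptions. The first function shows an equality check against the empty string, which is the comparison the SyntaxWarning suggests (note that the path in the log points into Selenium's own firefox_profile.py, not the bot's code). The other two log what the browser actually loaded when a lookup fails, so a stub page would show up in the Heroku logs. The names is_unset, describe_page and find_or_log and the logger name "abitcheck" are hypothetical; only the driver attributes page_source, current_url and title come from Selenium.

```python
import logging

logger = logging.getLogger("abitcheck")  # hypothetical logger name


def is_unset(setting):
    """True for None or an empty string.

    None is a singleton, so "is" is correct for it; the empty string
    must be compared by value with "==".
    """
    return setting is None or setting == ""


def describe_page(driver, max_chars=2000):
    """Summarize what the browser loaded: URL, title, size, start of the HTML."""
    html = driver.page_source or ""
    return (
        f"url={driver.current_url!r} title={driver.title!r} "
        f"length={len(html)}\n{html[:max_chars]}"
    )


def find_or_log(driver, find, max_chars=2000):
    """Call find(driver); if it raises, log the page content and re-raise."""
    try:
        return find(driver)
    except Exception:
        logger.error("Lookup failed; page was:\n%s",
                     describe_page(driver, max_chars))
        raise
```

With the code from the question, the lookup could be wrapped as find_or_log(driver, lambda d: d.find_element_by_xpath('/html/body/div[7]/div/div/div[3]/span')). If the logged HTML turns out to be a short stub rather than the real page, the XPath itself is not the problem.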

Nick Sl, 2021-11-30
@nmkru

Were you able to solve the problem?
I also get the first error on Heroku.
