Python
Mikkkch, 2020-11-20 19:41:56

Selenium not working on Heroku?

Hello, let's go straight to my settings.
In code:

import os

from selenium import webdriver

op = webdriver.ChromeOptions()
op.binary_location = os.environ.get('GOOGLE_CHROME_BIN')
op.add_argument('--headless')
op.add_argument('--no-sandbox')
op.add_argument('--disable-dev-shm-usage')

driver = webdriver.Chrome(executable_path=os.environ.get('CHROMEDRIVER_PATH'), options=op)


Heroku buildpacks:
  1. heroku/python
  2. https://github.com/heroku/heroku-buildpack-google-...
  3. https://github.com/heroku/heroku-buildpack-chromedriver


Keys:
  • CHROMEDRIVER_PATH=/app/.chromedriver/bin/chromedriver
  • GOOGLE_CHROME_BIN=/app/.apt/usr/bin/google-chrome
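
To rule out a wrong path in the config vars, a check like the one below could be run on the dyno before creating the driver. This is only a sketch: `find_bad_paths` is a hypothetical helper name, and it only verifies that the two variables listed above are set and point to existing files, nothing more.

```python
import os
from typing import Mapping

# Names taken from the config vars listed above.
REQUIRED_VARS = ("CHROMEDRIVER_PATH", "GOOGLE_CHROME_BIN")


def find_bad_paths(env: Mapping[str, str] = os.environ) -> dict:
    """Return {var_name: reason} for every variable that is unset or
    points to a file that does not exist. An empty dict means both are fine."""
    problems = {}
    for name in REQUIRED_VARS:
        value = env.get(name)
        if not value:
            problems[name] = "not set"
        elif not os.path.isfile(value):
            problems[name] = f"no file at {value}"
    return problems


if __name__ == "__main__":
    print(find_bad_paths() or "both paths look fine")
```

Since the session info in the error already reports a headless Chrome version, the binaries are probably found, so this mostly serves to exclude the simplest cause.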


Error raised when calling driver.get():
selenium.common.exceptions.WebDriverException: Message: unknown error: net::ERR_CONNECTION_TIMED_OUT

Full traceback:

2020-11-20T16:37:52.441037+00:00 app[worker.1]: Task exception was never retrieved
2020-11-20T16:37:52.441073+00:00 app[worker.1]: future: exception=WebDriverException('unknown error: net::ERR_CONNECTION_TIMED_OUT\n (Session info: headless chrome=87.0.4280.66)', None, None)>
2020-11-20T16:37:52.441075+00:00 app[worker.1]: Traceback (most recent call last):
2020-11-20T16:37:52.441076+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/aiogram/dispatcher/dispatcher.py", line 388, in _process_polling_updates
2020-11-20T16:37:52.441077+00:00 app[worker.1]: for responses in itertools.chain.from_iterable(await self.process_updates(updates, fast)):
2020-11-20T16:37:52.441078+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/aiogram/dispatcher/dispatcher.py", line 225, in process_updates
2020-11-20T16:37:52.441079+00:00 app[worker.1]: return await asyncio.gather(*tasks)
2020-11-20T16:37:52.441079+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/aiogram/dispatcher/handler.py", line 117, in notify
2020-11-20T16:37:52.441080+00:00 app[worker.1]: response = await handler_obj.handler(*args, **partial_data)
2020-11-20T16:37:52.441081+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/aiogram/dispatcher/dispatcher.py", line 246, in process_update
2020-11-20T16:37:52.441081+00:00 app[worker.1]: return await self.message_handlers.notify(update.message)
2020-11-20T16:37:52.441082+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/aiogram/dispatcher/handler.py", line 117, in notify
2020-11-20T16:37:52.441082+00:00 app[worker.1]: response = await handler_obj.handler(*args, **partial_data)
2020-11-20T16:37:52.441082+00:00 app[worker.1]: File "/app/main.py", line 72, in listening_to_links
2020-11-20T16:37:52.441083+00:00 app[worker.1]: await tasks_distribution(message.text, 60)
2020-11-20T16:37:52.441083+00:00 app[worker.1]: File "/app/main.py", line 49, in tasks_distribution
2020-11-20T16:37:52.441084+00:00 app[worker.1]: data = await crawl_data(link)
2020-11-20T16:37:52.441084+00:00 app[worker.1]: File "/app/main.py", line 27, in crawl_data
2020-11-20T16:37:52.441085+00:00 app[worker.1]: driver.get(link)
2020-11-20T16:37:52.441085+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 333, in get
2020-11-20T16:37:52.441085+00:00 app[worker.1]: self.execute(Command.GET, {'url': url})
2020-11-20T16:37:52.441086+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
2020-11-20T16:37:52.441086+00:00 app[worker.1]: self.error_handler.check_response(response)
File "/app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
2020-11-20T16:37:52.441087+00:00 app[worker.1]: raise exception_class(message, screen, stacktrace)
2020-11-20T16:37:52.441087+00:00 app[worker.1]: selenium.common.exceptions.WebDriverException: Message: unknown error: net::ERR_CONNECTION_TIMED_OUT
2020-11-20T16:37:52.441088+00:00 app[worker.1]: (Session info: headless chrome=)
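
Since net::ERR_CONNECTION_TIMED_OUT is a network-level error reported by Chrome, one way to narrow it down might be to check from the same dyno whether the target host accepts a plain TCP connection at all, without Selenium involved. The sketch below uses only the standard library; `can_reach` is a hypothetical helper, and port 443 and the 5-second timeout are assumptions.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds
    within `timeout` seconds, False on a timeout or any other socket error."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers timeouts, refused connections and DNS failures
        return False
```

If this also returns False for the host of the link while working for other sites, the problem is likely between Heroku and that site rather than in the Selenium setup.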


Just in case: right at startup (I launch the application with the command heroku scale worker=1) it prints the following warning:
/app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/firefox/firefox_profile.py:208: SyntaxWarning: "is" with a literal. Did you mean "=="?
if setting is None or setting is '':

Perhaps this will be useful.
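
For context, this warning comes from Selenium's Firefox profile module, not from the Chrome code path, so it is most likely unrelated to the timeout. Python 3.8 and later emit it whenever `is` is used to compare against a literal. A minimal illustration of the comparison the warning suggests (`is_unset` is a hypothetical name, not part of Selenium):

```python
def is_unset(setting) -> bool:
    # `is` checks object identity, which is only reliable for singletons
    # such as None; an empty string should be compared by value with `==`.
    return setting is None or setting == ''
```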

And finally, the code that hangs and then raises the error:

async def crawl_data(link: str) -> Union[dict, None, str]:
    driver.get(link)
    # execution never reaches this point anyway
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    # ....


async def tasks_distribution(link: str, wait_duration: int) -> None:
    while True:

        data = await crawl_data(link)

        if data == settings.PINNACLE_LATE:
            return None

        elif data:

            message_text = data['commands'] + '\n\n' + data['link']

            await parser.send_message(chat_id=settings.OWNER_ID, text=message_text)

            return None

        await asyncio.sleep(wait_duration)

@dp.message_handler()
async def listening_to_links(message: Message):
    await tasks_distribution(message.text, 60)
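
A side note on the code above: driver.get() is a synchronous call, so while it waits, the whole asyncio event loop (and with it the bot) is blocked. The sketch below shows one possible way to move a blocking call into the default thread pool. `run_blocking` and `slow_fetch` are hypothetical names, and `slow_fetch` is only a stand-in for driver.get(link) so the example runs without Selenium. It would not fix the timeout itself, only keep the bot responsive while waiting.

```python
import asyncio
import time
from typing import Callable


async def run_blocking(func: Callable, *args):
    """Run a blocking callable in the default thread pool so the
    event loop can keep serving other coroutines in the meantime."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, func, *args)


def slow_fetch(link: str) -> str:
    # Stand-in for a blocking call such as driver.get(link).
    time.sleep(0.2)
    return f"fetched {link}"


async def main() -> list:
    # The ticker runs concurrently to show the loop is not blocked.
    ticks = []

    async def ticker():
        for i in range(3):
            ticks.append(i)
            await asyncio.sleep(0.05)

    result, _ = await asyncio.gather(
        run_blocking(slow_fetch, "https://example.com"), ticker()
    )
    return [result, ticks]


if __name__ == "__main__":
    print(asyncio.run(main()))
```

In crawl_data this would correspond to something like `await run_blocking(driver.get, link)`, with the caveat that a single driver instance should not be used from several threads at the same time.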


Help me please!
