CAPTCHA
HexUserHex, 2020-07-06 13:18:30

How to pass a simple captcha with Selenium?

Good afternoon,
there is a site that most likely has a WAF installed, but I could not determine which one. A captcha pops up, and I try to bypass it in the following way:

1. Get the HTML content
2. If there is a captcha, pass it 'manually' (by the way, the captcha is very simple, and I think it could be solved with Selenium itself...)
3. Save the cookies to a file
4. Reuse the saved cookies on every subsequent request to avoid the captcha

#!/usr/bin/env python3
import bs4, time, random, pickle
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

#https://github.com/mozilla/geckodriver/releases/download/v0.26.0/geckodriver-v0.26.0-linux64.tar.gz
#https://github.com/mozilla/geckodriver/releases/download/v0.26.0/geckodriver-v0.26.0-win64.zip
web_driver = r'geckodriver'


def get_content_selenium(www_url, cookie_save, hide):
  op = webdriver.FirefoxOptions()

  #Run firefox in hidden (headless) mode
  if hide:
    op.add_argument('--headless')

  driver = webdriver.Firefox(executable_path = web_driver, options = op)

  try:
    #Pass the captcha by hand and save cookies
    if cookie_save:
      driver.get(url=www_url)

      #Time to pass the captcha
      timeout = random.randint(15, 20)

      print('timeout: ', timeout)
      time.sleep(timeout)

      with open('cookies_fnac.pkl', 'wb') as f:
        pickle.dump(driver.get_cookies(), f)

    #Load saved cookies and get page
    else:
      with open('cookies_fnac.pkl', 'rb') as f:
        cookies = pickle.load(f)

      #Send GET request first: cookies can only be set for the domain that is open
      driver.get(url=www_url)

      #Set cookies
      for cookie in cookies:
        driver.add_cookie(cookie)

      #Refresh page so the request is repeated with the cookies
      driver.refresh()

    #Get html source
    html = driver.page_source
  finally:
    #Always close the browser, even if something fails
    driver.quit()

  return html



def main():
  result = get_content_selenium('https://fnac.com', 1, 0)
  print(result)

if __name__ == "__main__":
  main()


First I run it, go through the captcha, and save the cookies:
result = get_content_selenium('https://fnac.com', 1, 0)


Next, I reuse the saved cookies from the passed captcha to access the content:
result = get_content_selenium('https://fnac.com', 0, 0)


But the captcha still pops up...
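
One thing that may be worth ruling out here (an assumption, not a confirmed cause) is that some of the saved cookies have already expired by the time they are loaded. The sketch below splits a list of cookie dicts, in the shape returned by driver.get_cookies(), into still-valid and expired ones. The helper name split_expired and the sample cookie names are hypothetical and used only for illustration.

```python
import time

def split_expired(cookies, now=None):
    """Split Selenium-style cookie dicts into (valid, expired) lists.

    Cookies without an 'expiry' key are session cookies and are kept as valid.
    """
    now = time.time() if now is None else now
    valid, expired = [], []
    for cookie in cookies:
        expiry = cookie.get('expiry')  # epoch seconds, may be absent
        if expiry is not None and expiry <= now:
            expired.append(cookie)
        else:
            valid.append(cookie)
    return valid, expired


# Hypothetical sample data, not real cookies from the site
sample = [
    {'name': 'session', 'value': 'abc'},
    {'name': 'captcha_ok', 'value': '1', 'expiry': 1000},
    {'name': 'prefs', 'value': 'x', 'expiry': 5000},
]
valid, expired = split_expired(sample, now=2000)
print([c['name'] for c in valid])    # ['session', 'prefs']
print([c['name'] for c in expired])  # ['captcha_ok']
```

In the script above, this could be applied to the list loaded from cookies_fnac.pkl before calling driver.add_cookie. If nothing has expired and the captcha still appears, the protection probably relies on more than cookies, which this check cannot detect.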

P.S. If anyone can solve this problem cleanly, name your price.


1 answer(s)
Georg1000, 2021-04-22

I have the same problem... Did you manage to solve it somehow?
