Python
Electr1k, 2021-01-26 20:53:25

How to get a page's HTML including the changes made by a browser extension?

There is a site and a browser extension for it that adds its own block (a div) to the site's HTML. I decided to parse this site in Python. I found nothing in requests + bs4 for working with extensions, so I decided to parse it with Selenium. I loaded the extension into the browser, but when I fetch the HTML, the program prints the original source code rather than the page as modified by the extension, even though the extension's block is visible in the browser. Here is the code:

from selenium import webdriver
import os
from selenium.webdriver.chrome.options import Options

headers = {'user-agent':'*', 'accept':'*'}


executable_path = "chromedriver.exe"
os.environ["webdriver.chrome.driver"] = executable_path

chrome_options = Options()
chrome_options.add_extension('1.crx')  # extension

driver = webdriver.Chrome(executable_path=executable_path, options=chrome_options)
driver.get("*url*")
html = driver.page_source
print(html)

I did not find a method for getting the page code in the Selenium documentation, but I found .page_source on the Internet.
I thought the problem was in .page_source itself, that it returns the original source code. But after replacing it with other code, the block added by the extension is still missing from the output, although it is displayed in the browser. So the question is: why is the extension's block missing, and how do I fix it?

1 answer(s)
MinTnt, 2021-01-26
@MinTnt

You can still use requests, since most content that appears dynamically on a page is loaded through XHR requests. Step-by-step instructions:
1) First, open the developer tools (Inspect element), go to the Network tab, and filter by XHR; the dynamically loaded text mostly shows up there.

2) Next, click Preview or Response on each request to find the one that carries the data you need more quickly.

3) When you find the right request, go to Headers and note the method (GET / POST) and the request URL.

4) Copy all the request headers except the ones whose names begin with ":" (HTTP/2 pseudo-headers such as :authority, :method, :path); these will be our headers.

Before use, you also need to convert them into a dictionary, for example:
# accept: */*
head = {'accept': '*/*', ... }
5) If this is a POST request, also take the parameters from the Payload tab.
6) Repeat the request with the method and data you collected: requests.get(url, headers=head)
or, for a POST request, requests.post(url, headers=head, data=Payload)
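
Steps 4 to 6 could be sketched roughly like this. It is only an illustration under assumptions: the names parse_raw_headers, replay_request and RAW_HEADERS are made up for the example, the header text and example.com are placeholders for whatever you copied from the Headers tab, and the timeout and raise_for_status() call are additions that the steps above do not mention.

```python
from typing import Dict, Optional


def parse_raw_headers(raw: str) -> Dict[str, str]:
    """Turn header lines copied from DevTools ("name: value") into a dict."""
    headers: Dict[str, str] = {}
    for line in raw.splitlines():
        line = line.strip()
        # Skip blank lines and HTTP/2 pseudo-headers such as ":authority".
        if not line or line.startswith(":"):
            continue
        # Split on the first colon only, so values like URLs stay intact.
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a "name: value" line
        headers[name.strip().lower()] = value.strip()
    return headers


def replay_request(url: str, head: Dict[str, str], method: str = "get",
                   payload: Optional[dict] = None) -> str:
    """Repeat the XHR request found in the Network tab and return its body."""
    # Imported here so the parsing helper above works without requests installed.
    import requests

    if method.lower() == "post":
        response = requests.post(url, headers=head, data=payload, timeout=10)
    else:
        response = requests.get(url, headers=head, timeout=10)
    response.raise_for_status()  # fail loudly on 4xx / 5xx answers
    return response.text


# Placeholder header text; paste the real lines from the Headers tab instead.
RAW_HEADERS = """
:authority: example.com
:method: GET
accept: */*
user-agent: Mozilla/5.0
"""

head = parse_raw_headers(RAW_HEADERS)
print(head)
```

After that, a call such as replay_request(url, head) for GET, or replay_request(url, head, "post", payload) for POST, should return the same text you saw under Response, provided the site does not require extra cookies or tokens. If the Payload tab shows JSON rather than form data, passing it as json= instead of data= in requests.post is probably the right choice.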
