Google
Herben, 2022-04-09 20:47:21

Can't parse Google SafeBrowsing, what's wrong?

I wanted to parse https://transparencyreport.google.com/safe-browsin... , a service where you can check whether Google has flagged a domain or a link.

Whatever I tried, the output was "[]", even after adding a User-Agent header. I assumed I had made a mistake, but when I printed the whole page from the same script, the HTML was completely different from what I see when inspecting the page in the browser. Opening the page source in the browser showed me why: the inspector shows one thing and the page source shows another.

import requests
from bs4 import BeautifulSoup

url = 'https://transparencyreport.google.com/safe-browsing/search?url=discord-free.com'
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:45.0) Gecko/20100101 Firefox/45.0'
}
page = requests.get(url, headers=headers)

soup = BeautifulSoup(page.text, "html.parser")
data = soup.find_all(class_='material-icons ng-star-inserted')

print(data)


6251c0178e47a475352641.png
Above is the inspector view, where all the elements are present.
6251c06db3e67868614170.png
Below is the original page source; this is what the script reads, and it finds nothing there.
6251c5c1179ab019446360.png

Is it possible to somehow parse this?
Thanks in advance!


1 answer(s)
nokimaro, 2022-04-09
@Herben

Because you need to parse not the page itself but the XHR request that is executed when the page loads:

https://transparencyreport.google.com/transparencyreport/api/v3/safebrowsing/status?site=discord-free.com
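A minimal sketch of calling that endpoint directly, using only the standard library. The function names are made up for illustration, and the response format is an assumption: Google endpoints of this kind often prepend a guard line such as `)]}'` before the JSON, so the sketch strips it if present and leaves interpretation of the decoded data to you.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the XHR request mentioned above.
STATUS_URL = (
    "https://transparencyreport.google.com"
    "/transparencyreport/api/v3/safebrowsing/status"
)

# Assumption: the body may start with an anti-hijacking guard line.
XSSI_PREFIX = ")]}'"


def parse_status_payload(text):
    """Strip the assumed guard prefix, if present, and decode the JSON."""
    body = text.lstrip()
    if body.startswith(XSSI_PREFIX):
        body = body[len(XSSI_PREFIX):]
    return json.loads(body)


def fetch_status(site, timeout=10):
    """Request the status for one site and return the decoded payload."""
    query = urllib.parse.urlencode({"site": site})
    request = urllib.request.Request(
        f"{STATUS_URL}?{query}",
        headers={"User-Agent": "Mozilla/5.0"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_status_payload(response.read().decode("utf-8"))
```

Calling `fetch_status("discord-free.com")` should return the decoded data; print it once to see its structure before relying on any particular field, since this endpoint is not a documented public API and may change.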

Also, why not use the official API?
https://developers.google.com/safe-browsing/v4/loo...
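For the official route, here is a rough sketch of a Safe Browsing v4 Lookup request (`threatMatches:find`). It assumes you already have an API key from the Google Cloud console. The helper names, the client ID and the chosen threat types are placeholders for illustration, so check them against the official documentation.

```python
import json
import urllib.parse
import urllib.request

LOOKUP_URL = "https://safebrowsing.googleapis.com/v4/threatMatches:find"


def build_lookup_body(urls, client_id="example-client", client_version="0.1"):
    """Build the JSON body for a lookup; client values are placeholders."""
    return {
        "client": {"clientId": client_id, "clientVersion": client_version},
        "threatInfo": {
            "threatTypes": ["MALWARE", "SOCIAL_ENGINEERING"],
            "platformTypes": ["ANY_PLATFORM"],
            "threatEntryTypes": ["URL"],
            "threatEntries": [{"url": u} for u in urls],
        },
    }


def find_threat_matches(urls, api_key, timeout=10):
    """POST the lookup and return the list of matches (empty if none)."""
    query = urllib.parse.urlencode({"key": api_key})
    request = urllib.request.Request(
        f"{LOOKUP_URL}?{query}",
        data=json.dumps(build_lookup_body(urls)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        result = json.loads(response.read().decode("utf-8"))
    # An empty JSON object means no URL matched any threat list.
    return result.get("matches", [])
```

With a valid key, `find_threat_matches(["http://discord-free.com/"], api_key)` returns an empty list when nothing is flagged, which is easier to work with than scraping the rendered page.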
