How to bypass protection against bots on the site?
Hello everyone! I am new to Python, and I decided to start by writing a Telegram bot that sends new ads from a site (I live in Turkey, so the site is the Turkish https://www.sahibinden.com ). I started learning how to parse data and ran into a problem: instead of the real page, the request returns a stub that redirects to an "unusual usage" page. My code and the HTML I get back are below.
import requests
from bs4 import BeautifulSoup as Bs

url = "https://www.sahibinden.com"
# Present the request as coming from a desktop browser
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.105 YaBrowser/21.3.3.230 Yowser/2.5 Safari/537.36'}
response = requests.get(url, headers=headers)
html = response.text
# Save the response body so it can be inspected in a browser or editor
with open('test.html', 'w', encoding='utf-8') as fl:
    fl.write(html)
print(response.status_code, 'status_code')
<html>
<head>
<script type="text/javascript">
window.location.href = "https://www.sahibinden.com/olagan-disi-kullanim?c";
</script>
</head>
<body>
</body>
</html>
You are on the right track, but your disguise as a browser is too weak.
Start with Postman, without Python, and get correct responses there first.
I decided to check. The protection is triggered because certain headers are missing from the request. Copy all the headers a real browser sends, then discard them one by one, dropping everything that has no effect on the result.
Also, your request does not send cookies, some of which matter. And since ads on this site are not available without authorization, it is not clear what exactly you are trying to achieve with an "empty" request.
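The "start with every header, then discard what does not matter" approach can be sketched roughly as below. This is only an illustration: minimize_headers and fetch_ok are hypothetical helper names, and the check for the "olagan-disi-kullanim" marker assumes the block page always looks like the one in the question.

```python
from typing import Callable, Dict


def minimize_headers(
    headers: Dict[str, str],
    is_accepted: Callable[[Dict[str, str]], bool],
) -> Dict[str, str]:
    """Try dropping each header in turn; keep only the ones the site requires."""
    needed = dict(headers)
    for name in list(headers):
        trial = {k: v for k, v in needed.items() if k != name}
        if is_accepted(trial):
            # The request still works without this header, so discard it.
            needed = trial
    return needed


def fetch_ok(headers: Dict[str, str]) -> bool:
    """Hypothetical check: True if the site did not answer with the redirect stub."""
    # Imported here so minimize_headers can be used without requests installed.
    import requests

    response = requests.get("https://www.sahibinden.com", headers=headers, timeout=10)
    # Assumption: the block page always contains this path, as in the question.
    return "olagan-disi-kullanim" not in response.text
```

Calling minimize_headers(browser_headers, fetch_ok) with the full set of headers copied from a browser would then leave only the headers whose removal triggers the protection. It sends one request per header, so it is worth pausing between calls.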
Update: in general, to get past the protection it is enough to add this to headers:
'Upgrade-Insecure-Requests': '1'
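Applied to the code from the question, that might look like the sketch below. The helper names (build_headers, is_blocked, fetch) are made up for illustration, and whether this single header is still enough depends on the site's current protection, which may have changed.

```python
# Path of the "unusual usage" page that the stub in the question redirects to.
BLOCK_MARKER = "olagan-disi-kullanim"


def build_headers(user_agent: str) -> dict:
    """Headers from the question plus the one suggested in this answer."""
    return {
        "User-Agent": user_agent,
        "Upgrade-Insecure-Requests": "1",
    }


def is_blocked(html: str) -> bool:
    """True if the response body is the redirect stub rather than a real page."""
    return BLOCK_MARKER in html


def fetch(url: str, user_agent: str) -> str:
    """Request the page and fail loudly if the protection was triggered."""
    # Imported here so the helpers above work without requests installed.
    import requests

    response = requests.get(url, headers=build_headers(user_agent), timeout=10)
    response.raise_for_status()
    if is_blocked(response.text):
        raise RuntimeError("received the bot-protection redirect page")
    return response.text
```

Checking the body matters because a status code alone may not reveal the block: the redirect in the question is done by JavaScript inside an otherwise ordinary HTML page.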