Python
Pavel_132131, 2021-06-08 12:34:33

Response error [403] when using Fake Useragent?

I need to open the links from the list "a" one by one, but sooner or later I get a Response [403] error because of the number of requests. The sec.gov website states that the request rate should not exceed 10 requests per second. First I tried setting a delay of 1 second; that did not help. Then I set a large random delay, with the same result. After that I used the fake_useragent library, but I still get Response [403].
What could be causing this error?
Code attached below.

import random
import time

import requests
from fake_useragent import UserAgent

ua = UserAgent()
delay = random.randint(45, 63)  # seconds to wait between requests

a = [
    'https://www.sec.gov/Archives/edgar/data/2488/00000...',
    'https://www.sec.gov/Archives/edgar/data/2488/00000...',
    'https://www.sec.gov/Archives/edgar/data/2488/00000...',
    'https://www.sec.gov/Archives/edgar/data/2488/00000...',
    'https://www.sec.gov/Archives/edgar/data/2488/00000...',
    'https://www.sec.gov/Archives/edgar/data/2488/00000...',
    'https://www.sec.gov/Archives/edgar/data/2488/00000...',
    'https://www.sec.gov/Archives/edgar/data/2488/00000...',
    'https://www.sec.gov/Archives/edgar/data/2488/00000...',
    'https://www.sec.gov/Archives/edgar/data/2488/00000...',
    'https://www.sec.gov/Archives/edgar/data/2488/00000...',
    'https://www.sec.gov/Archives/edgar/data/2488/00000...',
    'https://www.sec.gov/Archives/edgar/data/2488/00000...',
    'https://www.sec.gov/Archives/edgar/data/2488/00000...'
]

for page_link in a:
    time.sleep(delay)
    # Send a random Chrome user-agent string with each request
    response = requests.get(page_link, headers={'User-Agent': ua.chrome})
    print(response)
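One thing worth checking (an assumption on my part, not stated in the thread): SEC.gov's fair-access guidance asks automated clients to send a descriptive User-Agent that identifies the requester, and anonymous-looking clients can get 403 regardless of request rate. A minimal sketch, with a hypothetical contact string and a simple exponential-backoff helper:

```python
import time

import requests

# Hypothetical identification -- replace with your real name/contact.
# SEC's fair-access guidance asks automated clients to identify themselves.
HEADERS = {"User-Agent": "Sample Company admin@example.com"}


def backoff_delays(max_retries=4, base_delay=1.0):
    """Exponential backoff schedule: base_delay * 2**attempt."""
    return [base_delay * 2 ** i for i in range(max_retries)]


def fetch_with_backoff(url, session, max_retries=4):
    """GET a URL, waiting progressively longer after each 403/429."""
    response = None
    for delay in backoff_delays(max_retries):
        response = session.get(url, headers=HEADERS)
        if response.status_code not in (403, 429):
            break
        time.sleep(delay)
    return response
```

With a `requests.Session()`, you could then walk the list with `fetch_with_backoff(link, session)` plus a fixed sleep between links to stay well under 10 requests per second.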

1 answer
soremix (@SoreMix), 2021-06-08

It's pretty easy for a server to distinguish a normal browser visit from a curl/requests/etc. request.
The site sits behind Akamai, which is known for its very aggressive stance toward automation and other "inappropriate" use of sites. Be thankful it doesn't ban you just for pressing F5 too quickly. So I don't think a simple User-Agent substitution will get you past it; the next step would be Selenium in headless mode, but Akamai often cuts that off too, though I think it depends on the site's settings.
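To illustrate why the User-Agent swap alone matters so little: unless overridden, requests announces itself in its default User-Agent header, which is one of the simplest signals an edge service like Akamai can key on (a sketch; the exact version string varies by installed version):

```python
import requests

session = requests.Session()
# The default header identifies the library, e.g. "python-requests/2.31.0"
default_ua = session.headers["User-Agent"]
print(default_ua)
```

Even with that header replaced, TLS fingerprints, header ordering, and missing browser headers can still give automation away.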
