Python
r4khic, 2019-09-05 11:56:55

requests.exceptions.TooManyRedirects: Exceeded 30 redirects. How do I fix this error?

I am parsing this portal: the title, date, and content of each news item. For parsing I use Python 3.7 with the BS4 library. While parsing this portal, I get the following error:

Error:
Traceback (most recent call last):
  File "C:/Users/Администратор/PycharmProjects/Task/parser.py", line 136, in <module>
    call_all_func(resources)
  File "C:/Users/Администратор/PycharmProjects/Task/parser.py", line 117, in call_all_func
    item_page = get_html(resource_link,encodings_rule)
  File "C:/Users/Администратор/PycharmProjects/Task/parser.py", line 14, in get_html
    r = requests.get(url)
  File "C:\Users\Администратор\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Users\Администратор\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\Администратор\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\Администратор\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 668, in send
    history = [resp for resp in gen] if allow_redirects else []
  File "C:\Users\Администратор\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 668, in <listcomp>
    history = [resp for resp in gen] if allow_redirects else []
  File "C:\Users\Администратор\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 165, in resolve_redirects
    raise TooManyRedirects('Exceeded %s redirects.' % self.max_redirects, response=resp)
requests.exceptions.TooManyRedirects: Exceeded 30 redirects.

I have run into this problem before, and even posted my question here together with the answer I stumbled on.
At the time I thought the cause was cyclic redirects: the page redirects to itself, and requests exceeds its redirect limit while trying to reach the final page.

I solved it like this: the solution was to ignore the error by catching it.

This time it seems to be different.
Here is the code that is used:
# Fetch the HTML code.
def get_html(url,encodings_rule):
    r = requests.get(url)
    try:
        if (len(encodings_rule) == 1):
            r.encoding = encodings_rule[0]
    except requests.exceptions.TooManyRedirects:
        pass 
    return r.text


From the traceback, it is clear that the interpreter complains about this line of the code:
r = requests.get(url)

P.S. Comrades, I'm at a dead end.

1 answer(s)
Ivan Yakushenko, 2019-09-05
@r4khic

Oh, those Kazakh websites...
There are no redirects there. Even with redirects disabled, the page still loads normally:

>>> r = requests.get('https://ru.egemen.kz/tag/kazakhstan', headers=headers, allow_redirects=False)
>>> r.status_code
200

Based on this, the problem is most likely on your end. Try using a proxy, or just set the parameter allow_redirects=True explicitly (it is already the default for GET, so this is very unlikely to help, but still).
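
Whether redirects really occur from the machine running the parser can be checked by following them one hop at a time. What follows is only a sketch of that idea, not something the check above confirms: trace_redirects, its max_hops parameter, and the 10-second timeout are hypothetical names and values chosen for illustration. It relies on allow_redirects=False, as in the check above, so that each 3xx response is returned instead of being followed.

```python
from urllib.parse import urljoin

import requests


def trace_redirects(url, session=None, max_hops=10):
    """Follow redirects one hop at a time.

    Returns (chain, looped): the URLs visited in order, and whether
    the chain came back to a URL that was already visited.
    trace_redirects and max_hops are hypothetical, for illustration only.
    """
    if session is None:
        # A session keeps cookies between hops, as a browser would.
        session = requests.Session()
    chain = [url]
    seen = {url}
    for _ in range(max_hops):
        # With allow_redirects=False, requests returns the 3xx response itself.
        response = session.get(chain[-1], allow_redirects=False, timeout=10)
        if not response.is_redirect:
            return chain, False  # reached a final, non-redirect response
        # Location may be relative, so resolve it against the current URL.
        next_url = urljoin(chain[-1], response.headers["Location"])
        chain.append(next_url)
        if next_url in seen:
            return chain, True  # the chain returned to an earlier URL
        seen.add(next_url)
    return chain, False  # gave up after max_hops without seeing a repeat
```

Calling trace_redirects('https://ru.egemen.kz/tag/kazakhstan') on the parser's machine and printing the result shows where each hop leads. A one-element chain would match the 200 seen above, while a chain that returns to an earlier URL would point to a genuine loop.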
You are also not handling the exception correctly. The exception is raised by the line r = requests.get(url), but that line is outside your try-except block.
It needs to be like this:
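import requests
from requests.exceptions import TooManyRedirects


def get_html(url, encodings_rule):
    try:
        r = requests.get(url)
        if len(encodings_rule) == 1:
            r.encoding = encodings_rule[0]
    except TooManyRedirects as e:
        # Report the failing URL; r was never assigned, so return None
        # instead of r.text and let the caller skip this page.
        print(f'{url} : {e}')
        return None
    return r.text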
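
The interactive check above passed headers to requests.get, while get_html sends none, so it may be worth making the parser's requests look the same. The following is a hedged variation rather than a confirmed fix: whether this site treats clients without browser headers differently is an assumption, and make_session, BROWSER_HEADERS, the User-Agent value, the lowered redirect limit, and the timeout are all hypothetical choices for illustration.

```python
import requests
from requests.exceptions import TooManyRedirects

# Hypothetical header set; any realistic browser User-Agent could be used.
BROWSER_HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; example-parser/1.0)"}


def make_session(max_redirects=10):
    """Create a session that sends the same headers on every request."""
    session = requests.Session()
    session.headers.update(BROWSER_HEADERS)
    # Fail sooner than the default limit of 30 redirects.
    session.max_redirects = max_redirects
    return session


def get_html(session, url, encodings_rule):
    """Return the page text, or None if the redirect limit is exceeded."""
    try:
        r = session.get(url, timeout=10)
    except TooManyRedirects as e:
        print(f"{url} : {e}")
        return None
    if len(encodings_rule) == 1:
        r.encoding = encodings_rule[0]
    return r.text
```

The caller would create one session with make_session() and pass it to every get_html call, skipping any page for which None comes back.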
