Sergey Lazarevic, 2021-11-21 22:29:10
Python

How to get the redirect address?

Good day! Can you suggest how to get the redirect address from the Response Headers after a POST request?

import requests
from bs4 import BeautifulSoup

# Global Variables for this utility
HEADERS = {
    'authority':'jut.su',
    'method':'POST',
    'path':'/search/',
    'scheme':'https',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    # 'accept-encoding': 'gzip, deflate, br',
    # 'accept-language': 'en-US,en;q=0.9,ru;q=0.8',
    'cache-control': 'no-cache',
    'content-length': '43',
    'content-type': 'application/x-www-form-urlencoded',
    'origin': 'https://jut.su',
    'pragma': 'no-cache',
    'referer': 'https://jut.su/',
    'sec-ch-ua': '"Microsoft Edge";v="95", "Chromium";v="95", ";Not A Brand";v="99"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': "Windows",
    'sec-fetch-dest': 'document',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-site': 'same-origin',
    'sec-fetch-user': '?1',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36 Edg/95.0.1020.53',
}
SERVICE = 'https://jut.su/'
SEARCH  = 'https://jut.su/search/'


print('AniTool Command Line Utility')
print('============================')


def isFound(object):
    if object:
        return "YES"
    else:
        return "NO"


def SearchAnime(name, returnDebugData):
    SearchData = {
        'makeme':'yes',
        'ystext':name,
    }
    client = requests.session()
    SearchRequest = client.post(SEARCH, data=SearchData, headers=HEADERS)
    soup = BeautifulSoup(SearchRequest.text, 'html.parser')
    dataDiv = soup.find("div", class_="anime_next_announce_msg_text")
    if returnDebugData:
        print(SearchRequest.text)

        print("\n")

        print('=============================')
        print("RESPONSE CODE: " + str(SearchRequest.status_code))
        print("URL: " + SearchRequest.url)
        print("ANIME FOUND:" + isFound(dataDiv))
        print("ENCODING: " + "FORM/XXX-APPLICATION")
        #print("LOCATION: " + SearchRequest.headers['location'])

        print("\n")
        
        print("========== HEADERS ==========")
        print(SearchRequest.headers)

        print("\n")

        print("========== HISTORY ==========")
        print(SearchRequest.history)

        print("========= FORM DATA =========")
        print(SearchData)
    else:
        return SearchRequest.text


anime = input("Enter anime name that you want to find: ")
print(SearchAnime("За гранью", returnDebugData=True))

After this POST request, the following response headers arrive:
{'Date': 'Sun, 21 Nov 2021 18:01:55 GMT', 'Content-Type': 'text/html; charset=windows-1251', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Vary': 'Accept-Encoding', 'X-Powered-By': 'PHP/5.6.40', 'X-Frame-Options': 'SAMEORIGIN', 'Set-Cookie': 'PHPSESSID=l6eieb1rmsplu11nnvensumto0; path=/; domain=.jut.su; HttpOnly', 'Pragma': 'no-cache', 'X-Page-Speed': '1.13.35.2-0', 'Cache-Control': 'max-age=0, no-cache', 'CF-Cache-Status': 'DYNAMIC', 'Expect-CT': 'max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"', 'Report-To': '{"endpoints":[{"url":"https:\\/\\/a.nel.cloudflare.com\\/report\\/v3?s=X1oCL0K5TrB3yk7w7IduMYTF3EZs6PEIyhIT0tWsR4oC8liVc4Wno5Be11ZRFODkJSxIFLhqejs4J7xIpNVo31vO89HZ0pglZ6wB03YoSjbh%2BXW5wyKZtw%3D%3D"}],"group":"cf-nel","max_age":604800}', 'NEL': '{"success_fraction":0,"report_to":"cf-nel","max_age":604800}', 'Strict-Transport-Security': 'max-age=31536000', 'Server': 'cloudflare', 'CF-RAY': '6b1bd376b94f7a71-DME', 'Content-Encoding': 'br', 'alt-svc': 'h3=":443"; ma=86400, h3-29=":443"; ma=86400, h3-28=":443"; ma=86400, h3-27=":443"; ma=86400'}

But no response header with a Location field ever arrives. The status code of this request is 200, although the history shows a 302.

1 answer(s)
soremix, 2021-11-21
@fafnir_dragon

You print the status code and it shows RESPONSE CODE: 200, doesn't it?
There is no Location in the headers because requests has already followed that redirect (if that is what happened), so the headers you see belong to the final response. That is why a single 302 redirect sits in history.
For example, with httpbin:

import requests

r = requests.get('https://nghttp2.org/httpbin/redirect-to?url=https%3A%2F%2Fgoogle.com&status_code=302')
print(r.url)
print(r.status_code)
print(r.headers)
print(r.history)
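
If you would rather keep following redirects and still see where each hop pointed, the intermediate responses stored in history keep their own headers. A minimal sketch, assuming an object that behaves like requests.Response; the helper name redirect_chain is hypothetical, not part of requests:

```python
def redirect_chain(response):
    """Return (status_code, Location) for every redirect that was followed.

    `response` is assumed to behave like requests.Response: `history`
    holds the intermediate responses, oldest first, and each of them
    still carries the headers it was received with.
    """
    return [
        (hop.status_code, hop.headers.get("Location"))
        for hop in response.history
    ]
```

Called as redirect_chain(r) on the response above, it should list the single 302 hop together with its Location value. That value may be a relative path, whereas r.url is always the absolute final URL.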

In the httpbin example, url is the final URL, and status_code refers to it as well. By default, requests follows redirects on its own. If you don't want that, here you go:
SearchRequest = client.post(SEARCH, data=SearchData, headers=HEADERS, allow_redirects=False)
print(SearchRequest.headers['location'])
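
Two things can go wrong with the direct lookup: if the server answers without a redirect, headers['location'] raises KeyError, and the Location value may be a relative path. A small sketch that handles both, assuming a requests.Response-like object obtained with allow_redirects=False; redirect_target and REDIRECT_CODES are hypothetical names:

```python
from urllib.parse import urljoin

# Status codes that normally carry a Location header.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def redirect_target(response):
    """Return the absolute redirect URL of an unfollowed response, or None."""
    location = response.headers.get("Location")
    if response.status_code not in REDIRECT_CODES or location is None:
        return None
    # A relative Location is resolved against the URL that was requested.
    return urljoin(response.url, location)
```

With the request above, print(redirect_target(SearchRequest)) would print the absolute address, or None if no redirect came back.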

P.S. Half of the headers can be removed.
At a minimum:
  • authority
  • method
  • path
  • scheme
  • cache-control
  • content-length
  • content-type
  • origin
  • pragma

In fact, you can basically delete all of them and keep only accept and user-agent, and even those most likely won't affect the result.
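
The list above can be applied mechanically. In browser developer tools, authority, method, path and scheme are HTTP/2 pseudo-headers rather than ordinary header fields, and requests works out content-type and content-length itself from the data= argument. A rough sketch; REMOVABLE and trim_headers are hypothetical names:

```python
# Header names the answer lists as safe to drop.
REMOVABLE = {
    "authority", "method", "path", "scheme",
    "cache-control", "content-length", "content-type",
    "origin", "pragma",
}


def trim_headers(headers):
    """Return a copy of `headers` without the removable entries."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in REMOVABLE
    }
```

Passing headers=trim_headers(HEADERS) to client.post would then send only the remaining entries, such as accept and user-agent.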
