How do I implement authorization on a site for subsequent parsing in Python?

Python

Nikolai Gavrilovich, 2018-03-30 15:30:37

Hello everyone. Please explain, in words, code, or links, how to log in to a site and then parse its pages in Python.
What I have: a login and a password. I extract the CSRF token with BeautifulSoup and send
all of it with the requests post method.
I compared the tokens in the debugger, and they are identical.
I get a 400 response saying the data could not be verified.
What could be the problem?

2 answers

Nikolai Gavrilovich, 2018-03-30
@nikgavrilovich

Here is my solution. I don't claim it is clean code, but it works as a stopgap or as an aid to understanding.

import requests
from bs4 import BeautifulSoup

url_login = 'http://mysite.com/login'
url_main = 'http://mysite.com/'

# A Session keeps cookies between requests, so the cookie set by the
# GET below is sent back automatically with the POST.
client = requests.Session()
html = client.get(url_login)
soup = BeautifulSoup(html.text, 'lxml')

# The CSRF token sits in a hidden <input name="_csrf"> on the login form.
login_csrf = soup.find('input', attrs={'name': '_csrf'})['value']

payload = {
    'LoginForm[username]': '******',
    'LoginForm[password]': '******',
    'LoginForm[rememberMe]': 1,
    '_csrf': login_csrf
}

headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'ru-RU,ru;q=0.8,en-US;q=0.5,en;q=0.3',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)  Safari/537.36',
    'Referer': url_main,
    'Connection': 'keep-alive',
}

# Post through the same session (not requests.post) so the token matches
# its cookie and the login cookies are kept for further parsing.
r = client.post(url_login, data=payload, headers=headers)

print(r.status_code)
print(r.url)
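
The solution prints r.status_code and r.url to see whether the login went through. As a rough sketch of how that check could be automated before parsing further pages: the helper names login_succeeded and fetch_page are hypothetical, and the code assumes the site sends an unauthorized visitor back to the login URL, which not every site does.

```python
from urllib.parse import urlsplit


def login_succeeded(status_code, final_url, login_url):
    """Heuristic: a 2xx status and a final URL that is not the login page."""
    if not 200 <= status_code < 300:
        return False
    final_path = urlsplit(final_url).path.rstrip("/")
    login_path = urlsplit(login_url).path.rstrip("/")
    return final_path != login_path


def fetch_page(session, url, login_url):
    """GET a page with the logged-in session; fail loudly if bounced to login.

    `session` is expected to behave like requests.Session (an assumption).
    """
    response = session.get(url)
    response.raise_for_status()
    if not login_succeeded(response.status_code, response.url, login_url):
        raise RuntimeError("redirected to the login page; session is not authorized")
    return response.text
```

With the session from the solution, something like fetch_page(client, url_main, url_login) would return the HTML of the main page, which can then be passed to BeautifulSoup.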

spikejke, 2018-03-30
@spikejke

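import requests

# A Session stores the cookies set at login and sends them with later requests.
session = requests.Session()
authorization = session.post('http://mysite/login', data={'login': 'admin', 'password': 'secret'})

http://mysite/login is a placeholder for your site's login URL (requests needs the scheme, e.g. http://).
login and password are the names of the form's input fields.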
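
This short version posts only the login and password fields. On a site that protects the form with a CSRF token, as in the question, the hidden fields of the login page probably have to be sent as well. Below is a minimal sketch of collecting them, using the standard library's html.parser instead of BeautifulSoup so it runs without extra packages. The names HiddenInputCollector and build_login_payload are hypothetical, and the sketch assumes the token is delivered as a hidden input.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collects name/value pairs of every <input type="hidden"> in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        is_hidden = (attr.get("type") or "").lower() == "hidden"
        if is_hidden and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_login_payload(login_page_html, credentials):
    """Merge the page's hidden fields (e.g. a CSRF token) with the credentials."""
    collector = HiddenInputCollector()
    collector.feed(login_page_html)
    collector.close()
    payload = dict(collector.fields)
    payload.update(credentials)  # credentials win if a name collides
    return payload
```

The login page would first be fetched with session.get, and the result of build_login_payload(page.text, {'login': 'admin', 'password': 'secret'}) passed as data to session.post on the same session.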