Python
Maxim, 2015-04-08 15:43:04

Facebook parsing, what's wrong?

I need help selecting href values on Facebook.

from grab import Grab

def main():
  g = Grab()
  g.go('https://www.facebook.com/login.php?login_attempt=1')
  g.doc.set_input('email', 'email')
  g.doc.set_input('pass', 'paswd')
  g.doc.submit()

  g.go('https://www.facebook.com')
  for elem in g.doc.select("//*[contains(@class, '_5pcq')]//@href"):
    print(elem)

if __name__ == '__main__':
  main()

It doesn't output anything at all.
Where could I be going wrong?
Thank you!

3 answer(s)
Maxim, 2015-04-11
@Gadvain

I solved the problem with Selenium + PhantomJS.

lPolar, 2015-04-08
@lPolar

Absolutely nothing at all?
1. There is almost certainly AJAX on Facebook. You need to check which request is actually sent to the Facebook server during authorization. This can be done in Firefox with Firebug, for example.
2. What does the page source returned in the login response look like?
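
One way to answer the second question without a browser is to take the HTML returned after the form is submitted and check whether it still contains the login form. Below is a minimal sketch using only the standard library. The names `InputNameCollector` and `looks_like_login_page` and both sample pages are hypothetical; the field names `email` and `pass` are taken from the question's code, and the real Facebook markup may differ.

```python
from html.parser import HTMLParser


class InputNameCollector(HTMLParser):
    """Collects the name attribute of every <input> tag in a page."""

    def __init__(self):
        super().__init__()
        self.names = set()

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            name = dict(attrs).get("name")
            if name:
                self.names.add(name)


def looks_like_login_page(html, login_fields=("email", "pass")):
    """Return True if the page still contains all of the login form fields."""
    collector = InputNameCollector()
    collector.feed(html)
    return all(field in collector.names for field in login_fields)


# Hypothetical sample responses, for illustration only.
still_login = '<form><input name="email"><input name="pass" type="password"></form>'
logged_in = '<div class="_5pcq"><a href="/story/1">post</a></div>'

print(looks_like_login_page(still_login))  # True: the login form came back
print(looks_like_login_page(logged_in))    # False: no login fields found
```

If the check returns True for the page received after `submit()`, the login most likely did not go through, and the XPath query on the next page has nothing to match.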

Andrey, 2015-04-08
@andreypaa

First, you need to make sure that a response is received at all.
I think you need to pass at least the set of standard headers that a browser sends.
For example:
Accept: image/png,image/*;q=0.8,*/*;q=0.5
Accept-Encoding: gzip, deflate
Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.5,en;q=0.3
Connection: keep-alive
Cookie: (value omitted)
DNT: 1
Host: mc.yandex.ru
Referer: Facebook parsing, what's wrong?
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:36.0) Gecko/20100101 Firefox/36.0
But sometimes this is not enough.
You may need to first request the page that contains the form, read its hidden fields, and submit them along with the login and password.
It may also be necessary to execute some JavaScript code.
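
The hidden-field step could look roughly like this. It is a sketch under assumptions, not Facebook's actual login flow: the helpers `extract_hidden_fields` and `build_login_payload`, the sample form, the field name `csrf_token`, and the `example.com` URL are all hypothetical. The request object is only constructed here and never sent.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request


class HiddenFieldParser(HTMLParser):
    """Collects name/value pairs of <input type="hidden"> tags."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        is_hidden = (attrs.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def extract_hidden_fields(html):
    """Return a dict of all hidden form fields found in the page."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


def build_login_payload(html, email, password):
    """Combine the hidden fields from the form page with the credentials."""
    payload = extract_hidden_fields(html)
    payload.update({"email": email, "pass": password})
    return payload


# A couple of the browser-like headers from the list above.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:36.0) Gecko/20100101 Firefox/36.0",
    "Accept-Language": "ru-RU,ru;q=0.8,en-US;q=0.5,en;q=0.3",
}

# Hypothetical form page, for illustration only.
sample_form = """
<form action="/login.php" method="post">
  <input type="hidden" name="csrf_token" value="abc123">
  <input type="text" name="email">
  <input type="password" name="pass">
</form>
"""

payload = build_login_payload(sample_form, "user@example.com", "secret")
print(payload)

# Build (but do not send) a POST request carrying the payload and headers.
request = Request(
    "https://example.com/login.php",  # placeholder URL
    data=urlencode(payload).encode("ascii"),
    headers=BROWSER_HEADERS,
)
print(request.get_method())
```

This covers only the headers and hidden fields. If the page builds its content with JavaScript, a plain HTTP client will still see an empty feed, and a browser-driving tool is needed instead.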
