bash
Alexander Mekhonoshin, 2013-11-19 23:13:07

How to download a site with wget with cookies and userAgent?

There is a site running the Moodle CMS.
It has a page with links to various materials.
Access requires logging in first (login and password are sent as plain text; the session uses the MoodleSession and MOODLEID_ cookies).
I need to download the page together with all the materials.

I'm trying to use the cookies and user agent taken from Firefox after logging in manually, like this:

wget --load-cookies=cook --save-cookies=cook --keep-session-cookies --user-agent="Mozilla/5.0 (Windows NT 6.1; WOW64; rv:26.0) Gecko/20100101 Firefox/26.0" "http://dl.avalon.ru/course/view.php?id=315"

where cook is a cookie file in the Netscape HTTP Cookie File format. Here is exactly how I build it:
# HTTP cookie file.
# Generated by Wget on 2013-11-19 23:56:56.
# Edit at your own risk.

dl.avalon.ru    FALSE   /       FALSE   0       MoodleSession   jp98us55<***>vbbu7hj61
dl.avalon.ru    FALSE   /       FALSE   0       MoodleSessionTest       jma<***>WaY
dl.avalon.ru    FALSE   /       FALSE   1390074392      MOODLEID_       %25ED<***>%251CC%25B7d
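One thing that can be checked offline with a hand-assembled cookie file is its layout: the Netscape format expects seven fields per line, separated by tab characters, and a file pasted together with spaces will not parse as intended. The sketch below is only an illustration (the function name `check_cookie_file` is made up here), and a clean result does not prove the cookies themselves are still valid on the server.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report cookie lines that do not have exactly
# seven tab-separated fields (domain, subdomain flag, path, secure flag,
# expiry, name, value). Comment lines and blank lines are skipped.
check_cookie_file() {
    awk -F '\t' '
        /^#/ || /^[[:space:]]*$/ { next }
        NF != 7 {
            printf "line %d: %d field(s), expected 7\n", NR, NF
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

Running `check_cookie_file cook` prints nothing and returns 0 when every cookie line is well-formed. An expiry field of 0 marks a session cookie, which is how wget writes them when --keep-session-cookies is used.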


Result: wget reports code 200, but contrary to expectation the saved page contains only the "log in" form and the like.

Here is the console output of the command:
--2013-11-19 23:56:55--  http://dl.avalon.ru/course/view.php?id=316
Resolving dl.avalon.ru... 195.209.230.144
Connecting to dl.avalon.ru|195.209.230.144|:80... connected.
HTTP request sent, awaiting response... 303 See Other
Location: http://dl.avalon.ru/login/index.php [following]
--2013-11-19 23:56:56--  http://dl.avalon.ru/login/index.php
Reusing existing connection to dl.avalon.ru:80.
HTTP request sent, awaiting response... 200 OK
Length: 10228 (10.0K) [text/html]
Saving to: 'view.php@id=316'

100%[==============================================================================================>] 10,228      --.-K/s   in 0.006s

2013-11-19 23:56:56 (1.71 MB/s) - 'view.php@id=316' saved [10228/10228]


After executing the command, the cookies are changed.

Logging in with Firefox and downloading with wget are both done on the same machine:
Windows 7 x64 SP1 with Cygwin64.

1 answer(s)
BasilioCat, 2013-11-20
@a_ex

Log in with wget itself, by sending a POST with the required form fields. To be safe, first GET the login page /login/index.php, which sets the cookies, then POST to /login/index.php with a body like "username=aaa&password=aaa&testcookies=1". The server then records the successful login in its session data, after which you can download everything else.
PS: At night, no one will be around to notice the increased server load.
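The GET-then-POST sequence described above could be sketched roughly as follows. This is an untested outline, not a verified recipe for this particular site: the field names are taken from the answer as quoted, while the cookie file name `cook.txt`, the function names, the one-second wait, and the guess that the logout link sits under /login are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only. BASE and UA come from the question; COOKIES and all
# function names are hypothetical.
BASE="http://dl.avalon.ru"
COOKIES="cook.txt"
UA="Mozilla/5.0 (Windows NT 6.1; WOW64; rv:26.0) Gecko/20100101 Firefox/26.0"

# Build the form body in the shape quoted in the answer. Values are
# inserted verbatim, so a login or password containing &, =, % or
# spaces would need URL-encoding first.
login_post_data() {
    printf 'username=%s&password=%s&testcookies=1' "$1" "$2"
}

moodle_login() {
    # Step 1: GET the login page so the server issues session cookies.
    wget --user-agent="$UA" --keep-session-cookies \
         --save-cookies="$COOKIES" \
         -O /dev/null "$BASE/login/index.php" || return 1
    # Step 2: POST the credentials with those cookies, then save the
    # updated cookie set for later requests.
    wget --user-agent="$UA" --keep-session-cookies \
         --load-cookies="$COOKIES" --save-cookies="$COOKIES" \
         --post-data="$(login_post_data "$1" "$2")" \
         -O /dev/null "$BASE/login/index.php"
}

moodle_fetch() {
    # Step 3: fetch the page and what it links to, one level deep.
    # --wait spaces out requests; /login is excluded on the assumption
    # that following a logout link there would end the session.
    wget --user-agent="$UA" --keep-session-cookies \
         --load-cookies="$COOKIES" \
         --recursive --level=1 --page-requisites --convert-links \
         --wait=1 --exclude-directories=/login "$1"
}
```

With real credentials in place of the placeholders, usage would look like `moodle_login 'aaa' 'aaa' && moodle_fetch "$BASE/course/view.php?id=315"`. If the saved page still shows the login form, the actual form may use other field names than the ones assumed here.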
