PHP
Nikolai Shepelev, 2015-07-15 18:38:06

How do I parse pages with PHP when the page address is the same but the content depends on the login entered?

In brief:
There is a list of student record numbers, each of which is used to log in to a site that shows that student's study results. These results need to be parsed, combined, and presented in a convenient form on one page.
Now in detail:

  1. There is a list of record numbers used to log in to the site.
  2. There is a page where you log in to the site using only the record number ( domain/sign_in/ ). The page we need to parse data from ( domain/archive/ ) is, of course, available only after logging in, and each number has its own data.
  3. We parse the data, combine it, and display it on our website in a convenient form.

Now the questions:
  1. How do I implement logging in to the site?
  2. How do I "go" to the page with the data?
  3. And the main thing: how do I implement the parsing, and which tools / libraries should I use?

Everything has to be done in PHP(!)
I think it's pointless to describe the page structure; it's better if you look at it yourself.
Login page - goo.gl/2RFrN9
Data page - goo.gl/jTQzhd (available after login)
Number - 13048050
P.S. I'm a beginner at this and know PHP only superficially (you could say I don't know it at all). The main question is how to implement the login and the parsing.

3 answers
Stalker_RED, 2015-07-15
@venomkol

You can log in using curl, for example:

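$curl = curl_init(); // set CURLOPT_URL to the login page (domain/sign_in/)
curl_setopt($curl, CURLOPT_POST, true);
// "a=4&b=7" is placeholder form data; send the fields the login form expects
curl_setopt($curl, CURLOPT_POSTFIELDS, "a=4&b=7");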
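
To show how the login and the follow-up request fit together, here is a minimal sketch of the curl approach. It is not tested against the site from the question: the form field name `number`, the helper name `fetchArchiveForNumber`, and the base URL you pass in are all assumptions, so check the real login form for the field names it expects. The key idea is that the same curl handle (with a cookie jar enabled) is used for both requests, so the session cookie set at login is sent when requesting the data page.

```php
<?php
// Sketch only. Assumptions: the login form has a single field called
// 'number' and is posted to <base>/sign_in/; the data page is <base>/archive/.
function fetchArchiveForNumber(string $baseUrl, string $number, string $fieldName = 'number'): string
{
    // A separate cookie file per call, so sessions of different numbers never mix.
    $cookieFile = tempnam(sys_get_temp_dir(), 'cj_');

    $curl = curl_init();
    curl_setopt_array($curl, [
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow a redirect after login, if any
        CURLOPT_COOKIEJAR      => $cookieFile, // where cookies are written
        CURLOPT_COOKIEFILE     => $cookieFile, // enables cookie handling on this handle
    ]);

    try {
        // Step 1: log in by POSTing the record number.
        curl_setopt($curl, CURLOPT_URL, rtrim($baseUrl, '/') . '/sign_in/');
        curl_setopt($curl, CURLOPT_POST, true);
        curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query([$fieldName => $number]));
        if (curl_exec($curl) === false) {
            throw new RuntimeException('Login request failed: ' . curl_error($curl));
        }

        // Step 2: request the data page with the same handle (same cookies).
        curl_setopt($curl, CURLOPT_URL, rtrim($baseUrl, '/') . '/archive/');
        curl_setopt($curl, CURLOPT_HTTPGET, true); // switch back from POST to GET
        $html = curl_exec($curl);
        if ($html === false) {
            throw new RuntimeException('Archive request failed: ' . curl_error($curl));
        }
        return $html;
    } finally {
        unset($curl);          // release the handle
        @unlink($cookieFile);  // remove the temporary cookie file
    }
}
```

A call would look like `$html = fetchArchiveForNumber('http://domain', '13048050');`. If the login form also contains hidden fields (a CSRF token, for example), they would have to be read from the login page first and sent along; the sketch does not cover that case.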

And parse with Zend\Dom. It can be used as a standalone module; it consists of only three files.
$doc = new Zend\Dom\Query($html, 'utf-8');
$links = $doc->execute('ul.menu a');
foreach ($links as $link) {
    $url = $link->getAttribute('href');
    ...
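
If you would rather not pull in a library at all, the same kind of extraction can be done with PHP's built-in DOM extension (`DOMDocument` and `DOMXPath`). This is a different tool from Zend\Dom, shown here only as an alternative. The sketch below assumes the results on the data page sit in an HTML table, which I have not verified; the function name `extractTableRows` is made up for the example.

```php
<?php
// Sketch only. Assumption: the data we want is laid out as <tr> rows
// with <th>/<td> cells somewhere in the fetched HTML.
function extractTableRows(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    // The XML declaration prefix is a common way to make loadHTML treat input as UTF-8.
    $doc->loadHTML('<?xml encoding="utf-8" ?>' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query('//tr') as $tr) {
        $cells = [];
        // Direct child cells of this row, header or data.
        foreach ($xpath->query('./th|./td', $tr) as $cell) {
            $cells[] = trim($cell->textContent);
        }
        if ($cells !== []) {
            $rows[] = $cells;
        }
    }
    return $rows;
}
```

Each row comes back as a plain array of cell strings, which is easy to merge with rows from other pages later. If the page has several tables, the `//tr` expression would need to be narrowed, for example by the table's `id` or `class`.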

Muhammad, 2015-07-15
@muhammad_97

You can parse with DiDOM: https://github.com/imangazaliev/didom . It is easier and faster.

Baha Rustamov, 2015-07-15
@by133312

1. Log in and save the cookies.
2. Fetch the page with file_get_contents('') or curl and cut out what you need.
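
Since the question is about many numbers, the two steps above would be repeated for each one and the results merged. Here is a rough sketch of that outer loop. The function names `collectResults` and `renderResults` are invented for the example, and `$fetchRows` stands for whatever code you write to log in with one number and return its parsed rows as arrays of strings.

```php
<?php
// Sketch only. $fetchRows is a hypothetical callable: given one record
// number, it logs in, fetches the data page and returns rows of cell strings.
function collectResults(array $numbers, callable $fetchRows): array
{
    $combined = [];
    foreach ($numbers as $number) {
        // One login + fetch per number; results are keyed by the number.
        $combined[$number] = $fetchRows($number);
    }
    return $combined;
}

// Turn the combined data into one HTML page fragment.
function renderResults(array $combined): string
{
    $out = '';
    foreach ($combined as $number => $rows) {
        $out .= '<h2>' . htmlspecialchars((string) $number) . "</h2>\n<table>\n";
        foreach ($rows as $row) {
            // Escape every cell: the text comes from a remote site.
            $cells = array_map(
                fn($cell) => '<td>' . htmlspecialchars((string) $cell) . '</td>',
                $row
            );
            $out .= '<tr>' . implode('', $cells) . "</tr>\n";
        }
        $out .= "</table>\n";
    }
    return $out;
}
```

Keeping the fetching behind a callable means the loop and the output can be tried with fake data before touching the real site. With a long list of numbers it is probably wise to pause between requests so the remote server is not flooded.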
