Bjornie, 2017-08-20 15:18:36
Python

How to parse a page every 30 seconds and return the result in XML?

The task is to parse a single page, or rather the data in a specific table that is updated regularly. The data is dynamic: the page loads it via AJAX (judging by the Network tab, the request is made every 10 seconds). There is little data, just 20 cells with numbers.
I want to write a simple parser with Selenium, but I'm wondering how to host it on a remote server so that parsing runs at a given interval, and how to organize access to the collected data. Would it require some kind of micro-framework like Flask?
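
For the "access" part, a minimal Flask sketch, assuming the scheduled parser saves its result to a file (latest.xml and the route name are hypothetical):

from flask import Flask, Response

app = Flask(__name__)

@app.route("/data.xml")
def data_xml():
    # Return whatever the scheduled parser last wrote, as XML.
    with open("latest.xml", "rb") as f:
        return Response(f.read(), mimetype="application/xml")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)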


4 answer(s)
Emil Revencu, 2017-08-20
@Revencu

Use Selenium as a last resort (only if nothing else works). Since you have already seen the request (i.e. you know its URL, headers and, of course, cookies), try making the request with requests and parsing the response with lxml. You can run it in a loop and, knowing the start time, keep the interval with time.sleep().
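
For example, a rough sketch of that loop with requests and lxml (the URL, headers and XPath are placeholders; take the real ones from the Network tab):

import time
import requests
from lxml import html

URL = "https://example.com/ajax/table"   # placeholder: the AJAX endpoint from the Network tab
HEADERS = {"User-Agent": "Mozilla/5.0"}  # copy the real headers/cookies from the browser
PERIOD = 30                              # seconds between polls

while True:
    started = time.monotonic()
    resp = requests.get(URL, headers=HEADERS, timeout=10)
    tree = html.fromstring(resp.content)
    cells = tree.xpath("//table//td/text()")  # adjust the XPath to the actual table
    print(cells)
    # sleep for whatever is left of the 30-second slot
    time.sleep(max(0, PERIOD - (time.monotonic() - started)))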

x67, 2017-08-22
@x67

In the terminal run crontab -e, then in the editor, after all the comments, add:

* * * * * /usr/bin/python3 parser.py              # run the script every minute
* * * * * (sleep 30; /usr/bin/python3 parser.py)  # run the script every minute, delayed by 30 seconds

Store the data in a database and access it by querying that database. You can also run a server with Jupyter and use its tools to process and visualize the data.
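
For instance, a small sketch of the storage side with SQLite (the table and function names are made up for illustration):

import sqlite3
import time

conn = sqlite3.connect("cells.db")
conn.execute("CREATE TABLE IF NOT EXISTS cells (ts REAL, value TEXT)")

def save(values):
    # values: the list of cell texts produced by parser.py
    ts = time.time()
    conn.executemany("INSERT INTO cells (ts, value) VALUES (?, ?)",
                     [(ts, v) for v in values])
    conn.commit()

def latest():
    # return the most recent batch of cells
    (ts,) = conn.execute("SELECT MAX(ts) FROM cells").fetchone()
    return [v for (v,) in conn.execute("SELECT value FROM cells WHERE ts = ?", (ts,))]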

Evgen, 2017-08-25
@Verz1Lka

Do it with Scrapy. Parsers are easy to write with it, and export to XML is available out of the box.
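
Roughly like this (a sketch, not a finished spider; the URL and selectors are placeholders, and the FEEDS setting needs Scrapy 2.1+):

import scrapy

class TableSpider(scrapy.Spider):
    name = "table"
    start_urls = ["https://example.com/ajax/table"]  # placeholder: the AJAX endpoint

    # write the scraped items straight to XML, no extra code needed
    custom_settings = {"FEEDS": {"cells.xml": {"format": "xml"}}}

    def parse(self, response):
        for td in response.css("table td::text").getall():
            yield {"value": td.strip()}

You can run it with scrapy runspider without creating a full project.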

froststorm, 2017-08-27
@froststorm

The table can be parsed in three lines of code using pandas.
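
Something along these lines, assuming the AJAX endpoint returns plain HTML containing the table (the URL is a placeholder; DataFrame.to_xml needs pandas 1.3+):

import io
import pandas as pd
import requests

html_text = requests.get("https://example.com/ajax/table").text  # placeholder URL
df = pd.read_html(io.StringIO(html_text))[0]                     # first table found in the markup
print(df.to_xml())                                                # export the result as XML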
