Parsing
Viktor Vsk, 2014-12-26 13:58:19

Would a declarative parsing method be convenient?

It would be handy to have a tool that could, for example, parse selected pages from habr, including content from the inner pages, given input like this:

{
  "__url__": "'http://habrahabr.ru/page{{1,1,5}}/'",
  "posts": {
    "__iterator__": ".post",
    "name": "{{.post_title}}",
    "content": {
      "__follow__": "{{.post_title | first | attr_href}}",
      "post_body": "{{.content}}",
      "author": "{{.author > a}}",
      "comments": "{{#comments}}"
    }
  }
}

And get output along the lines of: pastie.org/9799295
I looked for similar tools but didn't find any. The closest in spirit, as far as I understand, are XSLT templates.
Is everything in the input parameters self-explanatory? CSS/XPath selectors, paging, applying filters to the result.
If this really seems useful, please say so: it will speed up the release of the gem, the documentation, and the launch of a test service.
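
To make the question about the input parameters concrete, here is a minimal sketch of how two pieces of the spec might be interpreted: the paging placeholder in `__url__` and the filter chain inside `{{...}}`. This is not the gem's code. The helper names `expand_url` and `parse_expression` are hypothetical, reading `{{1,1,5}}` as start, step, stop (inclusive) is an assumption, and Python is used only for illustration.

```python
import re

# Hypothetical helpers: the names and the placeholder semantics below are
# assumptions made for illustration, not the proposed gem's actual API.

PAGING = re.compile(r"\{\{(\d+),(\d+),(\d+)\}\}")
EXPRESSION = re.compile(r"^\{\{(.*)\}\}$", re.S)


def expand_url(template):
    """Expand a paging placeholder, assumed to mean {{start,step,stop}} (inclusive)."""
    match = PAGING.search(template)
    if match is None:
        # No placeholder: the template is already a single concrete URL.
        return [template]
    start, step, stop = (int(g) for g in match.groups())
    if step <= 0:
        raise ValueError("step must be positive")
    return [
        template[:match.start()] + str(n) + template[match.end():]
        for n in range(start, stop + 1, step)
    ]


def parse_expression(expr):
    """Split "{{selector | filter | ...}}" into a selector and a list of filter names."""
    match = EXPRESSION.match(expr.strip())
    if match is None:
        raise ValueError(f"not a template expression: {expr!r}")
    # The first segment is the selector; any remaining segments are filters.
    selector, *filters = (part.strip() for part in match.group(1).split("|"))
    return {"selector": selector, "filters": filters}


if __name__ == "__main__":
    print(expand_url("http://habrahabr.ru/page{{1,1,5}}/"))
    print(parse_expression("{{.post_title | first | attr_href}}"))
```

Splitting on `|` keeps the sketch short, but it would misread CSS selectors that themselves contain a pipe (for example `[lang|=en]`), so a real implementation would need a proper tokenizer.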


2 answer(s)
Armenian Radio, 2014-12-26
@gbg

The coolest syntax for automated page access would be SQL-like.

Rishat Kadyrov, 2014-12-26
@laska

Many sites generate a lot of their content with JS.
If your gem doesn't just naively analyze the page source but walks the fully rendered DOM tree, that will be very cool.
