How do I parse lexemes for an HTML template engine?

JavaScript

aki2, 2020-08-15 20:44:39

I have a lexer class that takes text, for example:

%div.content
  %h1.headline Заголовок

and returns an array of tokens:

[
  {
    lexem: 'tag',
    value: { name: 'div', content: '', attributes: [Object] }
  },
  { lexem: 'indent', value: ' ' },
  {
    lexem: 'tag',
    value: { name: 'h1', content: 'Заголовок', attributes: [Object] }
  }
]

The syntax is similar to Haml.
If one tag is nested inside another, an indent token is added to the token list.

I ran into a problem: I can't figure out how to express this nesting in the parser.
The parser should take these tokens and return an AST, something like this:

{
  type: "root",
  nodes: [
    {
      type: "tag",
      value: {
        name: "div",
        attributes: {
          class: "content"
        },
        nodes: [ // instead of content, I use nodes, which will hold the child tags
          {
            // the child tags go here, and child tags may contain further tags
          }
        ]
      }
    }
  ]
}

But I can't figure out how to make the parser, when one tag contains another, create nodes for the first tag and put the second tag into them. It sounds confusing, but that is the situation.
And if the second tag contains yet another tag, it turns into a complete mess.

Is there a standard token-parsing algorithm for this kind of HTML template engine?
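
For reference, one common approach to indentation-based syntaxes is to keep a stack of the currently open tags together with their indentation width: a new tag closes every open tag that is indented at least as deep, then becomes a child of whatever is left on top. Below is a minimal sketch of that idea, not a definitive implementation. It rests on assumptions that are not in the post: the function name `parse` is hypothetical, the `indent` tokens directly before a `tag` token describe the indentation of that tag's line, deeper nesting means a larger total indent width, `attributes` is a plain object, and `content` is kept next to `nodes` so the tag's text is not lost.

```javascript
// Hypothetical helper: turns the flat token list into a nested AST.
// Assumption: the indent tokens directly before a tag belong to that tag's line.
function parse(tokens) {
  const root = { type: 'root', nodes: [] };
  // Stack of open nodes. The root uses width -1 so it is never popped.
  const stack = [{ width: -1, nodes: root.nodes }];
  let width = 0; // indentation collected for the upcoming tag

  for (const token of tokens) {
    if (token.lexem === 'indent') {
      width += token.value.length;
      continue;
    }
    if (token.lexem !== 'tag') {
      throw new Error(`Unexpected token: ${token.lexem}`);
    }

    const { name, content, attributes } = token.value;
    const node = { type: 'tag', value: { name, attributes, content, nodes: [] } };

    // Close every open tag indented at least as deep as the new one.
    while (stack[stack.length - 1].width >= width) {
      stack.pop();
    }

    // The tag left on top of the stack is the parent.
    stack[stack.length - 1].nodes.push(node);
    stack.push({ width, nodes: node.value.nodes });
    width = 0; // the next tag starts a new line
  }

  return root;
}

// Usage with tokens shaped like the ones in the question
// (the attribute values here are assumed for illustration).
const tokens = [
  { lexem: 'tag', value: { name: 'div', content: '', attributes: { class: 'content' } } },
  { lexem: 'indent', value: '  ' },
  { lexem: 'tag', value: { name: 'h1', content: 'Заголовок', attributes: { class: 'headline' } } },
];

const ast = parse(tokens);
console.log(JSON.stringify(ast, null, 2));
```

Because the stack only compares widths, the same loop handles any nesting depth, and a tag that returns to a shallower indentation is attached to the correct ancestor. If the lexer emits one `indent` token per level rather than one per run of whitespace, the sketch should still work, since it sums the token lengths.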