How to encode it?

M

moneymakerXA2020-03-09 17:55:43

Python

moneymakerXA, 2020-03-09 17:55:43

How to encode it?

Suppose the data structure to represent a regular expression is as follows

reg_expr	:= { ‘key’ : ‘atm’, ‘val’ : token}
    | {‘key’ : ‘|’, ‘val’ : {‘fst’ : reg_expr, ‘snd’ : reg_expr}}
    | {‘key’ : ‘.’, ‘val’ : {‘fst’ : reg_expr, ‘snd’ : reg_expr}}
    | {‘key’ : ‘*”, ‘val’ : reg_expr}

Write a function that prints the corresponding regular expression under the following assumption:

Priority of the key is following ‘*’ > ‘.’ > ‘|’
‘*’ is postfix
‘.’ and ‘|’ are left associative

Tell me where to start, how to understand the reg_expr dictionary? I would be grateful for any advice or a link to a good article))

Reply

Answer the question

In order to leave comments, you need to log in

1 answer(s)

S

Sergey Pankov, 2020-03-11
@trapwalker

reg_expr is not a dictionary, but a grammar that describes an expression.
In Russian, it reads something like:
Regexp - THIS: EITHER
Atom
OR OperatorOr (with two arguments: the first is Regexp; the second is also Regexp)
OR OperatorPoint (with two arguments: the first is Regexp; the second is also Regexp)
OR OperatorAsterisk (with one argument , which is regexp).
Below is an important addition:

Priority of the key is following '*' > '.' > '|'
'*' is postfix
'.' and '|' are left associated

This means that the Asterisk Operator has the highest priority, followed by the Dot Operator, then the Or Operator.
The asterisk operator is postfix, that is, it is applied after its only argument.
OperatorPoint and OperatorOr are left-associative.
It looks like the author of the question needs to write a regular expression that will match a string that matches the described grammar.
Further, there can only be a solution to the problem, but the author did not ask about it, but only asked where to start.
I will not break his pleasure and spoil the result =)