D
D
dollar2019-04-01 14:10:22
Algorithms
dollar, 2019-04-01 14:10:22

Is there a known algorithm that parses expressions in complex languages ​​like JS and C?

For example, let's take only JavaScript and C ++, because the syntax is similar.
I know that in the general case it is done like this:
1) The string with the expression is tokenized, that is, it is split into successive pieces, where it is known exactly what each token is - a number, a bracket, an operator, a literal, etc.
2) The order of the tokens changes according to the Polish notation . This is such a kind of sorting that the sequence of calculations matches the order of the tokens in the array.
3) The count itself, where variables are replaced by their values, functions are called by name, etc.
The problem is that the Polish notation is too simple. It takes into account only operations with two parameters. And it does not take into account unary operators, that is, the expression1+ +1will already be wrong. It also doesn't take into account complex type operators A ? B : C. Functions with an arbitrary number of parameters are not supported, such as Math.max(a, b, c, d, и т.д.).
Is there a more general algorithm?

Answer the question

In order to leave comments, you need to log in

1 answer(s)
D
Denis Zagaevsky, 2019-04-01
@dollar

Polish notation takes into account anything. In the sense that you write, it will be.
unary operators? Do two operations - UNARY_MINUS, MINUS. 1 1 UNARY_MINUS MINUS == 2
Complex operators? ABC TERNARY (not lazy? well, you can do it lazily)
Functions? abcd 4 max call. Here a, b, c, d, 4, max are arguments, they all go on the stack. The interpreter sees call, pops the function (max) from the stack, understands that this is a function with a variable number of arguments, pulls out this number (4), pulls out the remaining arguments by number, calls the max(abcd) function.
There may be instructions in Polize that control the flow of execution 1234 JUMP - moves the cursor to address 1234.
It all depends on your perversity, in short.
In order not to be unfounded, here is my pet project, there the calculation is just implemented on Polize.
The Polish notation has disadvantages - it is difficult to analyze the program, calculate types. Difficult to optimize. ASTs are better suited for this.

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question