Regular Expressions
Talyutin, 2012-12-07 11:15:01

Can using RegExp be seen as accumulating technical debt?

This is not about one-off use of regular expressions (such as Find & Replace), but about using them in a project to solve routine problems.

A brief introduction to the concept of technical debt is available at these links:
habrahabr.ru/post/119490/
techforum.mail.ru/report/64


6 answer(s)
KEKSOV, 2012-12-07
@KEKSOV

For example, if we are talking about using RE to parse HTML (say, to extract links), then this is definitely bad: there are ready-made, higher-level solutions for that.
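As an illustration of the "higher-level solution" mentioned above, here is a minimal sketch of link extraction with Python's standard `html.parser` module. The class and function names (`LinkExtractor`, `extract_links`) are made up for this example and are not from the thread; a third-party HTML library would work along the same lines.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names for us.
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Hypothetical helper: return all link targets found in html_text."""
    parser = LinkExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.links
```

Quoting style, attribute order and tag case are handled by the parser, which is exactly the part that tends to bloat a hand-written regular expression.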
Personally, I had this experience with RE: I needed to parse the logs of a large telephone exchange. It all started with sed, and everything was fine and fast, until it turned out that this piece of hardware could asynchronously inject batches of other messages into the main message stream. As a result, the sed expression grew to more than a hundred lines and took about 10 minutes to run, and naturally only one person could make sense of it. When my patience ran out, I sat down and rewrote everything in C using flex and bison, and the program started running in 10 seconds. I think this is a good example of how RE contributes to the accumulation of technical debt.
But, on the other hand, if we are talking about, say, validating some user data, then why not.
I think the summary is this: if the input data has a complex structure or the number of input variants is large, RE should not be used. If the data is simple (no more than one line), RE is perfectly applicable. One more thing: if the data is simple but there is a lot of it (read: a highly loaded system that spends most of its time processing RE), then RE should not be used either; it would be more profitable to write your own parser for the specific task.
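To illustrate the "simple, single-line data" case, here is a small sketch of validating a user name with a regular expression. The rule itself (3 to 20 characters, first character a letter, then letters, digits or underscores) and the name `is_valid_username` are assumptions made up for this example.

```python
import re

# Hypothetical rule: 3-20 characters, starts with a letter,
# followed by letters, digits or underscores.
USERNAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{2,19}")


def is_valid_username(value):
    # fullmatch requires the whole string to match, not just a prefix,
    # so trailing junk (including a newline) is rejected.
    return USERNAME_RE.fullmatch(value) is not None
```

The pattern fits on one line and can be read at a glance, which is roughly the boundary described above.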

KEKSOV, 2012-12-07
@KEKSOV

Found a good discussion. The right idea is expressed there: do not use a hammer where a screwdriver is needed. By this I mean that only experience will tell you which tool is appropriate for a particular problem.

un1t, 2012-12-07
@un1t

Without specific examples it is not clear what exactly we are talking about, so the answers may be the opposite of each other.
On one of my projects I need to parse various XML files, and the files are quite large: 200-600 MB. At first I chose the standard Python solution, lxml.etree. It worked fine, but it turned out that not all of the files were well-formed. There can be all sorts of errors, such as unclosed tags, and the content can be in any encoding, not just the one specified in the XML header. That is, a single file can contain a whole bunch of different errors. In general, no standard solution could handle all of these problems. After searching for ready-made solutions, I wrote a parser based on regular expressions. This parser does not care about any errors at all and easily parses any broken file. In addition, it turned out to be 1.5 times faster than the lxml parser. In my case, the solution is adequate to the task.
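The parser itself is not shown in the thread, so the following is only a sketch of what an error-tolerant, regex-based extraction might look like. The record layout (`<item>` blocks containing `<id>` and `<name>`) and the function `extract_items` are hypothetical.

```python
import re

# Hypothetical layout: <item> ... <id>..</id> <name>..</name> ... </item>
ITEM_RE = re.compile(r"<item\b[^>]*>(.*?)</item>", re.DOTALL)
FIELD_RE = re.compile(r"<(id|name)>(.*?)</\1>", re.DOTALL)


def extract_items(text):
    """Yield one dict per <item> block, skipping anything that does not match."""
    for item in ITEM_RE.finditer(text):
        # Only properly closed fields are picked up; broken ones are ignored.
        fields = {
            name: value.strip()
            for name, value in FIELD_RE.findall(item.group(1))
        }
        if fields:
            yield fields


broken = (
    "<root><item><id>1</id><name>foo</name></item><junk>"
    "<item><id>2</id><name>bar</item></root"
)
items = list(extract_items(broken))
```

Here the stray `<junk>` tag, the unclosed `<name>` and the truncated `</root` do not stop the extraction; the second record simply comes back without a name. This sketch tolerates only some kinds of damage (an unclosed `<item>` would be merged with the next one), and for files of several hundred megabytes the input would presumably have to be read in chunks rather than as one string.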

LionAlex, 2012-12-07
@LionAlex

One day a programmer had a problem and decided to solve it using regular expressions. Now the programmer had two problems.

egorinsk, 2012-12-07
@egorinsk

I think you attach too much importance to small things. Nothing stops you from moving particularly suspicious code into a separate module or class and, if any serious problems arise, rewriting it with a different algorithm.
Or do you have a more specific example of horrendous consequences of using RegExp? Otherwise the situation resembles the fairy tale about Clever Elsie.
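A minimal sketch of that isolation idea: the regular expression lives behind one small function, so callers depend only on the function and the implementation can later be swapped for a hand-written parser. The log format (`<timestamp> <level> <message>`) and the name `parse_log_line` are assumptions for illustration.

```python
import re

# Implementation detail: callers never see this pattern.
_LINE_RE = re.compile(r"(\S+)\s+(\S+)\s+(.*)")


def parse_log_line(line):
    """Split a line into (timestamp, level, message), or return None.

    Hypothetical format: '<timestamp> <level> <message>'.
    """
    match = _LINE_RE.fullmatch(line.strip())
    if match is None:
        return None
    return match.groups()
```

If the pattern ever becomes a maintenance or performance problem, only the body of `parse_log_line` has to change.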

Wott, 2012-12-07
@Wott

Oh, it feels like you really like "magic rules" but have little desire to dig into the essence. Not enough smarts?
Regular expressions are a tool with a well-defined range of tasks.
Technical debt is a technique for speeding up the nearest phases of a project at the expense of later ones.
These are completely different things.
It is about the same as saying "passion for the Russian language increases the proportion of profanity in oral speech"
or "registering on Habr / VK / Steam worsens performance"
or, more subtly, "smoking shortens life expectancy" :)
