Regular Expressions
serj37, 2019-09-24 06:20:57

How to remove gibberish with a regular expression?

Good afternoon!
I have a log-processing task. Each line contains digits, English letters (mixed case), and special characters. I need to separate relatively meaningful text from gibberish (generated strings or just "garbage").
Given:
moskvichhuev
89028091133
skoda582
za*a541_893z**
rfv%:t27l=bz
Madam_Vanilla
Must remain:
moskvichhuev
89028091133
skoda582
Madam_Vanilla
(My assumption for what marks garbage or generated text: a capital letter that is not first and not after a space, plus characters from 3 character sets, plus no more than 3 letters in a row.) Is that a reasonable approach?

1 answer(s)
dollar, 2019-09-24

Regular expressions only handle relatively simple conditions.
Define what counts as "gibberish", and then it can be filtered out.
Or, conversely, define what valid text looks like, keep only that, and delete the rest as garbage.
Most likely you will need a more complex algorithm, something like counting the number and variety of characters and the ratio between character types. Regular expressions won't help with that.
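
As a rough illustration of the counting idea, here is a minimal Python sketch. It takes two of the conditions guessed at in the question (characters from at least 3 character sets, and no more than 3 letters in a row) and treats a line as garbage only when both hold. The function names and the thresholds are assumptions made for this example, not a tested classifier, and they are tuned only to the six sample lines from the question.

```python
from itertools import groupby


def char_classes(s: str) -> int:
    """Count how many of four character sets occur in s:
    lowercase letters, uppercase letters, digits, special characters."""
    return sum((
        any(c.islower() for c in s),
        any(c.isupper() for c in s),
        any(c.isdigit() for c in s),
        any(not c.isalnum() and not c.isspace() for c in s),
    ))


def longest_letter_run(s: str) -> int:
    """Length of the longest run of consecutive letters in s."""
    return max(
        (sum(1 for _ in group)
         for is_letter, group in groupby(s, key=str.isalpha)
         if is_letter),
        default=0,
    )


def looks_like_garbage(s: str, min_classes: int = 3, max_run: int = 3) -> bool:
    """Assumed heuristic: many character sets AND only short letter runs."""
    return char_classes(s) >= min_classes and longest_letter_run(s) <= max_run


lines = [
    "moskvichhuev",
    "89028091133",
    "skoda582",
    "za*a541_893z**",
    "rfv%:t27l=bz",
    "Madam_Vanilla",
]
kept = [s for s in lines if not looks_like_garbage(s)]
```

On the sample lines this keeps the four expected values and drops the other two. Real logs will almost certainly need different thresholds or extra signals (for example the ratio of digits to letters), so the defaults are only a starting point.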
