MySQL
Sergey, 2011-09-11 10:27:20

Regular expression for a search query of the form: word1 * (word2 -word3 -word4) * word5

That is, the question is: what would a regular expression look like that finds a set of words in a given order while excluding certain words within particular segments of the text?

For example, for a query of the form:

word1 * (word2 -word3 -word4) * word5

Matching strings:

word1 word99 word2 word98 word97 word5
word1 word2 word5

Non-matching strings:

word1 word99 word2 word98 WORD4 word97 word5
word1 WORD3 word2 word5

This is the second day I have been struggling with this problem...
I would be very grateful if someone could help.

P.S. All of this is meant for searching a MySQL database with the REGEXP operator. Using Sphinx is not an option.

5 answer(s)
Ivan Garbuz, 2011-09-11
@garbuzivan

^word1(?=.*word2)(?!.*word3|.*word4).*word5$
This pattern matches a string that starts with word1 and ends with word5. The positive lookahead requires word2 to appear somewhere after word1, and the negative lookahead rejects the string if word3 or word4 appears anywhere after word1, that is, between word1 and the closing word5.
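
The pattern above can be checked against the sample strings from the question. The following is only a minimal sketch in PHP (PCRE), not MySQL: the helper name matchesQuery is hypothetical, and the "i" flag is an assumption added because the question writes WORD3 and WORD4 in upper case and MySQL REGEXP ignores case by default for non-binary strings.

```php
<?php
// Hypothetical helper: true if $text satisfies the query
// "word1 * (word2 -word3 -word4) * word5" using the lookahead pattern.
function matchesQuery(string $text): bool
{
    // (?=.*word2)          word2 must occur somewhere after word1
    // (?!.*word3|.*word4)  neither word3 nor word4 may occur after word1
    // "i" flag: assumed, to mimic case-insensitive matching
    $pattern = '#^word1(?=.*word2)(?!.*word3|.*word4).*word5$#i';
    return preg_match($pattern, $text) === 1;
}

// Sample strings taken from the question.
$samples = [
    'word1 word99 word2 word98 word97 word5',
    'word1 word2 word5',
    'word1 word99 word2 word98 WORD4 word97 word5',
    'word1 WORD3 word2 word5',
];

foreach ($samples as $sample) {
    printf("%-8s : %s\n", matchesQuery($sample) ? 'match' : 'no match', $sample);
}
```

The first two samples match and the last two do not. Note that the pattern compares substrings, so "word3" would also reject a string containing "word31"; wrapping each word in \b would tighten this in PCRE.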

Ivan Garbuz, 2011-09-11
@garbuzivan

Regular expressions
See the section "Looking forward and backward" (lookahead and lookbehind), which covers excluding words. In general, read through the whole page; everything is explained very clearly. If anything is unclear, send me a private message.

Ivan Garbuz, 2011-09-11
@garbuzivan

PHP example:

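// "слово" means "word". The string starts with слово1, ends with слово5,
// and contains one of слово2/слово3/слово4 in between: prints 1.
if(preg_match("#^слово1.*(слово2|слово3|слово4).*слово5$#isU","слово1 слово99 слово2 слово98 слово97 слово5")){
  echo 1;
} else {
  echo 2;
}

// The string starts with слово0 instead of слово1: prints 2.
if(preg_match("#^слово1.*(слово2|слово3|слово4).*слово5$#isU","слово0 слово99 слово2 слово98 слово97 слово5")){
  echo 1;
} else {
  echo 2;
}

// Negative lookbehind: "Иванов" is not preceded by "Сергей ": prints 1.
if(preg_match("#(?<!Сергей )Иванов#isU","Игорь Иванов")){
  echo 1;
} else {
  echo 2;
}

// "Иванов" is preceded by "Сергей ", so the lookbehind rejects it: prints 2.
if(preg_match("#(?<!Сергей )Иванов#isU","Сергей Иванов")){
  echo 1;
} else {
  echo 2;
}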

gelas, 2011-09-11
@gelas

As far as I know, lookahead is not supported in MySQL. I still can't think of a way to do this with a single expression, but this option may work:
SELECT * FROM tbl_words
WHERE txt REGEXP 'word1.*(word2).*word5'
AND NOT txt REGEXP 'word1.*(word3|word4).*word5';
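
The two-condition approach can be tried outside the database as well. The sketch below mirrors the WHERE clause in PHP using two plain patterns without lookaround; the helper name matchesTwoStep is hypothetical, and the "i" flag is an assumption standing in for MySQL's default case-insensitive matching. For context, MySQL versions before 8.0.4 use a regex library without lookaround, while from 8.0.4 on REGEXP is built on ICU, which does support it.

```php
<?php
// Hypothetical helper mirroring the query:
//   txt REGEXP <required> AND NOT txt REGEXP <excluded>
function matchesTwoStep(string $text): bool
{
    // word1, then word2, then word5, in that order
    $required = '#word1.*word2.*word5#i';
    // word3 or word4 somewhere between word1 and word5
    $excluded = '#word1.*(word3|word4).*word5#i';

    return preg_match($required, $text) === 1
        && preg_match($excluded, $text) !== 1;
}

var_dump(matchesTwoStep('word1 word99 word2 word98 word97 word5'));
var_dump(matchesTwoStep('word1 WORD3 word2 word5'));
```

Unlike the lookahead version, these patterns are not anchored with ^ and $, so text before word1 or after word5 is allowed, just as in the SQL query.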

Ivan Garbuz, 2011-09-11
@garbuzivan

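// "слово" means "word". слово4 appears after слово1, so the negative
// lookahead rejects the string: prints 2.
if(preg_match("#^слово1(?!.*слово2|.*слово3|.*слово4).*слово5$#isU","слово1 слово99 слово4 слово98 слово97 слово5")){
  echo 1;
} else {
  echo 2;
}

// None of слово2/слово3/слово4 appears (слово54 is a different word): prints 1.
if(preg_match("#^слово1(?!.*слово2|.*слово3|.*слово4).*слово5$#isU","слово1 слово99 слово54 слово98 слово97 слово5")){
  echo 1;
} else {
  echo 2;
}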

As I already said, everything works!
