Maksim Sobolev, 2021-06-15 20:53:10
Python

How to check forbidden and allowed words?

Hello!
I would appreciate some help.
The task: I have a text of 30 to 500 characters (plain text, not HTML).
There are a lot of such texts (each text is in a variable).
What is the best way to go through all the words? (Some of the words can be links.)
I have a list of allowed links. How do I do a full check so that as soon as the first allowed link is found, I get True and the check stops?
And if the text contains other links that are forbidden or not in the list, I want to get False.
I used the following code, but I need to check about 20 links,
and writing a separate function like this for each link seems silly to me.
Could anyone point me in the right direction?
Here is the function I used:


import re

def GLBmYlink(isData):
    # True if the text contains a github.com link surrounded by spaces
    return bool(re.search(r' https://github\.com/.\w+ ', isData))


2 answer(s)
o5a, 2021-06-15
@sobolevmaksim

I'm still not sure whether regular expressions are needed at all. If the lists contain plain links rather than link patterns, you can use a simple substring test (link in text) instead of regular expressions:

# blacklist and whitelist are the lists of forbidden and allowed links, respectively
def check(text):
    if any(link in text for link in blacklist):
        return False
    elif any(link in text for link in whitelist):
        return True
    return False

If you do need regular expressions, you can combine the links into a single pattern of the form 'link1|link2|link3', which can be faster with a large number of links. For example:

import re

# blacklist and whitelist must be non-empty: an empty pattern matches any text
def check(text):
    if re.search('|'.join(map(re.escape, blacklist)), text):
        return False
    elif re.search('|'.join(map(re.escape, whitelist)), text):
        return True
    return False
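
Since there are many texts to check, it may be worth compiling the combined patterns once instead of rebuilding them on every call. Below is a minimal sketch of that idea. The contents of blacklist and whitelist are made-up placeholders, and build_pattern, BLACK_RE and WHITE_RE are hypothetical names introduced only for this example. An empty list is handled explicitly, because an empty pattern would match every text.

```python
import re

# Hypothetical example lists; replace them with your own links.
blacklist = ["https://example.com/spam", "https://bad.example.org"]
whitelist = ["https://github.com/", "https://docs.python.org/"]


def build_pattern(links):
    """Compile one alternation pattern, or return None for an empty list."""
    if not links:
        return None
    return re.compile("|".join(map(re.escape, links)))


# Compiled once, reused for every text.
BLACK_RE = build_pattern(blacklist)
WHITE_RE = build_pattern(whitelist)


def check(text):
    # Any forbidden link rejects the text outright.
    if BLACK_RE is not None and BLACK_RE.search(text):
        return False
    # Otherwise the text passes as soon as one allowed link is found.
    return WHITE_RE is not None and WHITE_RE.search(text) is not None
```

Note that this only rejects links that are explicitly in the blacklist; a link that appears in neither list does not make the result False.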

soremix, 2021-06-15
@SoreMix

Any task can easily be split into subtasks:
1. Find all links

links = re.findall(r'https://github\.com/\w+/\w+', isData)

2. Check if all found links are in the allowed list
all(link in allowed_links for link in links)
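
The two steps can be put together roughly like this. It is only a sketch under a few assumptions: the URL pattern is a deliberately rough one (scheme followed by non-whitespace) so that links to any site are found, not just GitHub; allowed_links holds made-up prefixes; and a link counts as allowed if it starts with one of them, which is a prefix test rather than the exact membership test above. Also note that all() returns True for an empty list, so the "no links at all" case is handled separately.

```python
import re

# Hypothetical allowed prefixes; replace them with your own list.
allowed_links = ("https://github.com/", "https://docs.python.org/")

# Rough URL matcher: scheme followed by non-whitespace characters.
URL_RE = re.compile(r"https?://\S+")


def check(text):
    # Step 1: find all links in the text.
    links = URL_RE.findall(text)
    # all() of an empty list is True, so require at least one link explicitly.
    if not links:
        return False
    # Step 2: every link found must start with an allowed prefix.
    return all(link.startswith(allowed_links) for link in links)
```

With this version a text containing both an allowed link and an unknown link gives False, which matches the "forbidden or not in the list" requirement from the question.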
