Python: what tools can parse an array of strings and extract the most frequently used mask?
I need to parse an array of URLs and somehow extract the most frequently used link mask. For example, I have the following URLs:
"lenta.ru/articles/2014/10/08/mosclassicgp"
"lenta.ru/photo/2014/10/07/longway"
"lenta.ru/photo/2014/10/03/misstuning"
"lenta.ru/photo/2014/08/27/nivajpg"
"lenta.ru/photo/2014/02/18/dynamic"
"lenta.ru/news/2014/10/08/nsxprice"
"lenta.ru/autosport"
A visual inspection shows that the most frequently used mask is lenta.ru/photo
I would like to get the same result automatically. Maybe there are libraries for this or, failing that, some kind of algorithm.
You are unlikely to find specific libraries, but the algorithm is extremely simple:
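# Criteria
def isdigits(s):
    # True only when s is non-empty and every character is a digit
    return s.isdigit()


def istext(s):
    # Some logic (left unspecified in the original answer).
    # Placeholder assumption: any non-empty, non-numeric segment is text.
    return bool(s) and not isdigits(s)


# Shape of a token: (type, value, length of value)
token = ("type_of_token", "value_of_token", len("value_of_token"))


def process_link(link):
    # Split the link on '/' and classify each segment
    tokenlist = []
    for i in link.split('/'):
        if isdigits(i):
            tokenlist.append(("digit", i, len(i)))
        elif istext(i):
            tokenlist.append(("text", i, len(i)))
    return tokenlist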
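The answer stops at tokenizing, so here is a minimal sketch of one possible way to get from segments to the most frequent mask. It is an assumption, not something stated above: numeric segments are replaced with a `<digit>` placeholder, and prefixes of a fixed depth are counted with `collections.Counter`. The names `to_mask` and `prefix_counts` and the `depth` parameter are hypothetical and chosen only for illustration.

```python
from collections import Counter


def to_mask(url):
    """Replace every all-digit path segment with a '<digit>' placeholder."""
    parts = url.strip().split("/")
    return "/".join("<digit>" if p.isdigit() else p for p in parts)


def prefix_counts(urls, depth):
    """Count how often each masked prefix of `depth` segments occurs."""
    counts = Counter()
    for url in urls:
        parts = to_mask(url).split("/")
        # Skip URLs that are shorter than the requested depth
        if len(parts) >= depth:
            counts["/".join(parts[:depth])] += 1
    return counts


urls = [
    "lenta.ru/articles/2014/10/08/mosclassicgp",
    "lenta.ru/photo/2014/10/07/longway",
    "lenta.ru/photo/2014/10/03/misstuning",
    "lenta.ru/photo/2014/08/27/nivajpg",
    "lenta.ru/photo/2014/02/18/dynamic",
    "lenta.ru/news/2014/10/08/nsxprice",
    "lenta.ru/autosport",
]

# Most frequent two-segment prefix (host plus first path segment)
mask, hits = prefix_counts(urls, depth=2).most_common(1)[0]
print(mask, hits)  # lenta.ru/photo 4
```

Choosing `depth` is a judgment call: a larger depth gives more specific masks (for example `lenta.ru/photo/<digit>/<digit>/<digit>`) but fewer hits per mask, so it may be worth comparing the counts at several depths.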