Answer the question
In order to leave comments, you need to log in
Working with anagrams, how can you optimize the script?
The script works according to the following algorithm:
def anagrams(s):
n=len(s)
if n == 1:
return s
sb=[]
for i in range(n):
sb.append(s[i])
rval=anagrams(s[0:i]+s[i+1:n])
if isinstance(rval,str):
sb[i]=sb[i]+rval
if isinstance(rval,list):
c=sb[i]
sb[i]=[c + rval[x] for x in range(len(rval))]
if(n==2):
return sb
else:
c=[sb[x][h] for x in range(len(sb)) for h in range(len(sb[x]))]
return c
def find_word(list,source):
L = []
for item in list:
if item in source and item not in L:
L.append(item)
if L:
print('Matches found', len(L))
print(L)
else:
print('No matches found')
Answer the question
In order to leave comments, you need to log in
A function that returns a list of anagrams from a given wordI recommend taking a closer look at the itertools module, in particular, the permutations functions. Sample code:
import itertools
def anagrams(word):
for permutation in itertools.permutations(word):
yield ''.join(permutation)
for word in anagrams('car'):
print(word)
car
cra
acr
arc
rca
rac
>>> for word in anagrams('rar'):
print word
rar
rra
arr
arr
rra
rar
A function that compares two lists and returns matchesIf I understood correctly, you keep lists of anagrams and dictionary words in memory and look for (linear search) their intersection. This, in my opinion, is not very effective. I would do like this:
words=['car','arc','cat','map','toster']
wordset=set(words)
for word in anagrams('car'):
if word in wordset:
print ("word %s matched vocabulary" % word)
We enter the wordFrom the very beginning, for each word from the dictionary, you can remember its 'sorted' version:
From the entered word we make anagrams
words=['car','arc','cat','map','toster']
sortedwordset=set(''.join(sorted(w)) for w in words)
>>> sortedwordset
set(['acr', 'eorstt', 'amp', 'act'])
Then for each word entered, you can check whether it makes sense to make anagrams:if ''.join(sorted(word)) in sortedwordset:
#continue with anagrams
def sorted_string(s):
return ''.join(sorted(s))
words=['car','arc','cat','map','toster']
d={}
for word in words:
sorted_word=sorted_string(word)
if sorted_word in d:
d[sorted_word].append(word)
else:
d[sorted_word]=[word]
>>> d
{'acr': ['car', 'arc'], 'eorstt': ['toster'], 'amp': ['map'], 'act': ['cat']}
>>> d.get(sorted_string('car'),[])
['car', 'arc']
>>> d.get(sorted_string('cat'),[])
['cat']
>>> d.get(sorted_string('perkele'),[])
[]
The calculation of anagrams (permutations) is not required here (paragraphs 2 and 3 are misleading).
Therefore, the speed of execution of the code below does not depend on the length of the word.
Based on 20,000 words, it's instantaneous even with 'antidisestablishmentarianism':
from collections import Counter
from itertools import ifilter
def criteria(dictword):
return (
wlen == len(dictword) and
wset == set(dictword) and
wcnt == Counter(dictword)
)
while True:
word = raw_input('\nEnter word: ')
wlen, wset, wcnt = len(word), set(word), Counter(word)
with open('thesaurus.txt') as f:
thesaurus = (line.rstrip() for line in f)
for dictword in ifilter(criteria, thesaurus):
print dictword
if word in {'exit', 'quit'}:
break
You don't have to store all the anagrams in one array: do a permutation, check in the word base, then the next permutation. This will get rid of the OutOfMemory problem. But it will add a little time, as there will be repetitions.
Then you can break the word base into 28 separate bases, where only words of a certain length are stored. Should speed up the search a bit.
But how to make it faster - I would use another programming language (Go, C, Java) and try to parallelize the process into threads.
First, prepare your dictionary.
1) Create a new hash.
2) We sort through the entire dictionary.
2.1) We split the current word into letters, sort them, glue them together.
2.2) Check if there is such a key in the hash.
2.2.1) If not, then create hash[key] = [] (empty array)
2.2.2) If yes, then add the current word to the array.
3) Preparation is over. We save the current hash and use it everywhere.
4) Comparison function: we take a word, we break it by letters, we sort it, we stick it together.
5) We check the presence of this key in our dictionary. We output the result.
JS example: jsfiddle.net/6bz8g9gz Python
example:
dict = ['wolf','flow','hello','world','folw','jack','open','close','nepo','peno','kill'];
def prepare(d):
hash = {};
for v in d:
v = v.lower()
list = [c for c in v]
list.sort()
result = "".join(list)
if hash.has_key(result):
hash[result].append(v)
else:
hash[result] = [v]
return hash
def test(word, d):
word = word.lower()
list = [c for c in word]
list.sort()
result = "".join(list)
if d.has_key(result) and len(d[result]) != 0:
return d[result]
else:
return False
d = prepare(dict)
print test("WolF", d)
Didn't find what you were looking for?
Ask your questionAsk a Question
731 491 924 answers to any question