NumPy: how to rewrite an algorithm?

Python

kAIST, 2021-04-04 20:38:05

The input is one NumPy array plus a large collection of arrays. The input has to be compared with each of them, and the most "similar" ones returned. It is done like this:

import numpy as np

input_array = np.array(....)
many_arrays = [np.array(....), np.array(....), ....]
# Euclidean distance from input_array to every array in many_arrays
dists = np.linalg.norm(many_arrays - input_array, axis=1)
ids = np.argsort(dists)[:20]  # indices of the 20 arrays most "similar" to input_array

But what if input_array is not one array but N arrays? That is, how do I get the list of arrays that are as similar as possible to all of the input arrays?
I'd like the most elegant solution; maybe it can be done in a couple of lines.


2 answer(s)

dmshar, 2021-04-04
@dmshar

Whether this counts as a couple of lines, judge for yourself:

import itertools
import numpy as np

many_arrays = [np.array([1, 2, 3]), np.array([4, 5, 6]), np.array([7, 8, 9])]
many_arrays2 = [np.array([1, 2, 3]), np.array([4, 5, 6]), np.array([7, 8, 9])]

# Every pair (a, b) with a from many_arrays and b from many_arrays2
prd = itertools.product(many_arrays, many_arrays2)
# Store [distance, a, b] for each pair
dists = [[np.linalg.norm(a - b), a, b] for a, b in prd]
# Sort the pairs by distance, closest first
dists = sorted(dists, key=lambda x: x[0])
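The same pairwise distances can also be computed without a Python loop by relying on NumPy broadcasting. The following is only a sketch under assumptions: the sample data and the name input_arrays are hypothetical, all arrays are assumed to have the same length, and "similar to all inputs" is interpreted here as the smallest total distance, which is one possible reading of the question.

```python
import numpy as np

# Hypothetical sample data: 4 candidate arrays and 2 input arrays, all of length 2
many_arrays = np.array([[0.5, 0.0], [3.0, 4.0], [0.0, 2.0], [10.0, 10.0]])
input_arrays = np.array([[0.0, 0.0], [1.0, 0.0]])

# (4, 1, 2) - (1, 2, 2) broadcasts to (4, 2, 2); taking the norm over the
# last axis gives a (4, 2) matrix: one row per candidate, one column per input
pairwise = np.linalg.norm(many_arrays[:, None, :] - input_arrays[None, :, :], axis=-1)

# Assumption: rank candidates by their summed distance to all inputs
total = pairwise.sum(axis=1)
ids = np.argsort(total)[:2]  # indices of the 2 closest candidates
print(ids)
```

The intermediate array has shape (candidates, inputs, length), so for very large collections memory use may become the limiting factor.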

U235U235, 2021-04-05
@U235U235

SciPy has scipy.spatial.distance.pdist.
I think this is what you need.
Correction: cdist, of course.
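For reference, pdist computes distances between the rows of a single collection, while cdist computes them between two collections, which is why cdist fits the question. A minimal sketch of how it might be used, assuming hypothetical sample data, arrays of equal length, and the mean distance as the measure of being "similar to all inputs" (the question does not fix that choice):

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical sample data: 2 input arrays and 4 candidate arrays
input_arrays = np.array([[0.0, 0.0], [1.0, 0.0]])
many_arrays = np.array([[0.5, 0.0], [3.0, 4.0], [0.0, 2.0], [10.0, 10.0]])

# Matrix of shape (len(input_arrays), len(many_arrays));
# the default metric is Euclidean
dists = cdist(input_arrays, many_arrays)

# Assumption: a candidate is "similar to all inputs" if its mean distance is small
mean_dist = dists.mean(axis=0)
ids = np.argsort(mean_dist)[:2]

# Alternative reading: the 2 nearest candidates for each input separately
per_input = np.argsort(dists, axis=1)[:, :2]
print(ids, per_input)
```

In the original setting the slice would be [:20] rather than [:2], and a list of arrays would first need to be combined into one 2-D array, for example with np.stack.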