Python
WinconeCoder, 2020-12-06 17:27:13

How do I search by cosine similarity?

Here is the problem: we have a database of vectors produced by a neural network that computes a unique set of parameters for each face.

These parameters are stored in the database.

Then we take another photo, compute the same parameters for it, and search the database.

We need to find the row whose values are closest.
As I understand it (from Habr), this is done with cosine similarity, which expresses how similar two vectors are as a percentage.
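For reference, here is a minimal sketch of cosine similarity between two vectors using NumPy. The function names cosine_similarity and as_percent are hypothetical, chosen only for illustration. Note that the raw value lies between -1 and 1, so reading it as a percentage is a loose convention rather than a strict one.

```python
import numpy as np


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b (range -1..1)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def as_percent(similarity):
    """Hypothetical helper: scale a similarity value to a percentage."""
    return 100.0 * similarity
```

With two face encodings like the ones in this question, the call would be cosine_similarity(face_encoding, face_encoding1).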

I have seen that full-text search works in much the same way, only with text. I would like to query with the entire array at once and find the most similar stored array.

I know that arrays can be stored in MongoDB, but I don’t know how to do what I described above.

MongoDB may not even be necessary, but I would still like to store all these parameters in one place for fast search.

Face 1 parameters:

[-0.09467582  0.01034473 -0.02064635 -0.0627897  -0.10621011 -0.02298409
  0.06024306 -0.07245292  0.19958037 -0.08510211  0.15435484 -0.02906684
 -0.25881869 -0.09997653  0.04966331  0.15049647 -0.18467687 -0.17127053
 -0.07990446 -0.04140811  0.05777079 -0.02871657 -0.07008     0.15270819
 -0.12112831 -0.34181389 -0.05736313 -0.08841895  0.02192516 -0.08674139
 -0.00225462  0.07204872 -0.18367235 -0.04173218  0.03215467  0.0609054
 -0.08098375 -0.06814954  0.19135296  0.02411688 -0.23507144 -0.03319901
  0.11054757  0.22216964  0.18632102  0.0070124  -0.02181121 -0.0435528
  0.11866748 -0.31335333  0.01463403  0.15883729  0.14864783  0.13290378
  0.08808084 -0.2080095   0.01723186  0.07500338 -0.28062934  0.03903538
 -0.05020006 -0.10300423 -0.04434941 -0.02988065  0.15134467  0.14888936
 -0.10457852 -0.14037935  0.11823665 -0.23838541 -0.07100967  0.07097597
 -0.06376494 -0.23478457 -0.26558936  0.02815399  0.37244275  0.20168875
 -0.20747432  0.01553329 -0.01438614 -0.08818501  0.07332679  0.08132945
 -0.05615982 -0.00572415 -0.05834332  0.06225963  0.1342393  -0.01651243
  0.02042192  0.22114027  0.03347541  0.0252452   0.02661949  0.05087201
 -0.0786478   0.03728545 -0.147938   -0.04321887  0.04343022 -0.03822023
  0.06526193  0.02413747 -0.19151178  0.18569151  0.05712438 -0.09302571
 -0.07091277  0.00447916 -0.11529445  0.02216976  0.10242402 -0.27299124
  0.2679556   0.1816932   0.00563922  0.16964656  0.00161655  0.00382699
  0.01180157 -0.11163969 -0.0791599  -0.05371575  0.08083108 -0.03230736
  0.0563346  -0.02492868]


Face 2 parameters:

[-0.10842698  0.06751744 -0.03576533 -0.02410238 -0.10450766 -0.00671807
 -0.01094658 -0.15583992  0.17468813 -0.08529656  0.26009229  0.01679563
 -0.27459297 -0.08906213  0.0094724   0.14103106 -0.15762219 -0.1020133
 -0.1185605  -0.04197036  0.0505495   0.0710125  -0.01071301  0.09425637
 -0.03537129 -0.33294344 -0.07992638 -0.05325703  0.04313097 -0.08847439
  0.0691345   0.02394961 -0.16866098 -0.05316673  0.04040826  0.05743459
 -0.13919455 -0.04760994  0.2300759   0.00740959 -0.15638578 -0.05217468
  0.05404993  0.21545541  0.14900987  0.07806974  0.04216603 -0.10764394
  0.0685627  -0.2875551   0.00671652  0.1457226   0.10842534  0.13295417
  0.00788296 -0.23921658 -0.01576398  0.12401786 -0.26180816  0.06234765
 -0.00872383 -0.10830986 -0.02049002 -0.00291615  0.23768103  0.12402728
 -0.13215128 -0.15710594  0.13069825 -0.25391164 -0.09819756  0.11684453
 -0.0602451  -0.25244108 -0.28748673 -0.01146378  0.43858832  0.12676194
 -0.14056295 -0.04472011 -0.05969332 -0.0976763   0.0461247   0.04969415
 -0.03701543 -0.06738835 -0.06807589  0.00916691  0.23380761 -0.0117625
 -0.04339299  0.27221215  0.05340466  0.06402096  0.01888075  0.0644009
 -0.00229403  0.00480994 -0.06849871  0.00507571 -0.0647999  -0.06304722
  0.06471531  0.00982873 -0.14988063  0.25307804  0.03390499 -0.01774084
 -0.05760072 -0.01716787 -0.1525414   0.00642409  0.19820943 -0.31351233
  0.29484281  0.19840276  0.0811023   0.21227948  0.01173819  0.07260138
 -0.00810404 -0.09367184 -0.14486414 -0.07504445  0.02670753 -0.04982813
  0.08031661  0.02423495]


These are exactly the values that need to be stored in the database and searched for.

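Program code:
import face_recognition

# Load the first photo and compute the encoding of the first face found in it.
image = face_recognition.load_image_file("photo1.jpg")
face_encoding = face_recognition.face_encodings(image)[0]

# Do the same for the second photo.
image1 = face_recognition.load_image_file("photo2.jpg")
face_encoding1 = face_recognition.face_encodings(image1)[0]

# face_distance returns a distance (lower means more similar);
# subtracting it from 1 turns it into a rough similarity score.
face_distances = list(
    1 - face_recognition.face_distance([face_encoding], face_encoding1)
)

print(face_distances)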


face_encoding holds the kind of array shown above; the code then compares the two encodings and reports how similar the people in the two photos are.

This works, but comparing against every row one at a time in a database with 50,000 rows does not seem sensible.
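One possible way to avoid a row-by-row Python loop is to hold all stored encodings in a single NumPy matrix and score every row with one matrix-vector product. The sketch below assumes the encodings fit in memory and uses random data in place of real face encodings; the function name top_k_cosine and the sizes are illustrative assumptions, not part of the original code.

```python
import numpy as np


def top_k_cosine(matrix, query, k=5):
    """Return the indices and cosine scores of the k rows most similar to query."""
    matrix = np.asarray(matrix, dtype=float)
    query = np.asarray(query, dtype=float)
    # Cosine similarity of every row against the query in one vectorized step.
    row_norms = np.linalg.norm(matrix, axis=1)
    scores = (matrix @ query) / (row_norms * np.linalg.norm(query))
    # Sort by descending score and keep the first k indices.
    order = np.argsort(-scores)[:k]
    return order, scores[order]


# Synthetic stand-in for a database of 50,000 encodings with 128 values each.
rng = np.random.default_rng(0)
database = rng.normal(size=(50_000, 128))

# A query that is a slightly perturbed copy of row 1234.
query = database[1234] + rng.normal(scale=0.01, size=128)

indices, scores = top_k_cosine(database, query, k=3)
print(indices, scores)
```

This is still an exhaustive scan, only a vectorized one. For much larger collections, an approximate nearest-neighbor index would be the usual next step.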


1 answer(s)
WinconeCoder, 2020-12-06
@WinconeCoder

As far as I understand, you need to look at Euclidean distance, but it is not entirely clear how to do this with MongoDB.
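A rough sketch of the Euclidean-distance approach follows, under the assumption that each encoding is kept as a plain list of floats inside a document, which is a shape MongoDB can store as an array field. No database calls are made here; the documents are ordinary dicts, and the names to_document, nearest_by_euclidean, "name", and "encoding" are hypothetical.

```python
import numpy as np


def to_document(name, encoding):
    """Build a dict with the encoding as a plain list (hypothetical schema)."""
    return {"name": name, "encoding": np.asarray(encoding, dtype=float).tolist()}


def nearest_by_euclidean(documents, query):
    """Return the document closest to query and its Euclidean distance."""
    # Rebuild one matrix from the stored lists, one row per document.
    matrix = np.array([doc["encoding"] for doc in documents], dtype=float)
    # Distance from the query to every row at once.
    distances = np.linalg.norm(matrix - np.asarray(query, dtype=float), axis=1)
    best = int(np.argmin(distances))
    return documents[best], float(distances[best])
```

In this sketch the documents would be loaded from the database once and the distance computed on the application side, so the database acts only as storage.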
