How to find the most similar face from the database?

B

Bleno2021-07-31 09:01:17

MongoDB

Bleno, 2021-07-31 09:01:17

There is a database in which each person is encoded with 128 numbers. For example:

Face 1

[ -0.1130942776799202,
     0.1614537537097931,
     -0.006035998929291964,
     -0.1542104184627533,
     -0.06875834614038467,
     -0.032062552869319916,
     0.05085310339927673,
     -0.0772537812590599,
     0.20312847197055817,
     -0.03393673151731491,
     0.14966045320034027,
     0.015613612718880177,
     -0.2537919580936432,
     -0.055424273014068604,
     -0.0216793492436409,
     0.09997572004795074,
     -0.10322976112365723,
     -0.11479102075099945,
     -0.13069766759872437,
     -0.02312583476305008,
     0.052273452281951904,
     0.03703169524669647,
     -0.0002829816658049822,
     -0.0005055302754044533,
     -0.04392200708389282,
     -0.2537318468093872,
     -0.08467373996973038,
     -0.07383286952972412,
     -0.013739113695919514,
     -0.0824621170759201,
     0.05842585861682892,
     -0.02282029017806053,
     -0.29698559641838074,
     -0.15543779730796814,
     -0.012977054342627525,
     0.05585605651140213,
     -0.18177944421768188,
     -0.08584177494049072,
     0.1914125382900238,
     0.049033816903829575,
     -0.12426487356424332,
     0.07370023429393768,
     0.10151664912700653,
     0.2125469595193863,
     0.15979231894016266,
     0.004027041606605053,
     0.052840933203697205,
     -0.11493529379367828,
     0.10623033344745636,
     -0.24218681454658508,
     0.0558830089867115,
     0.11607223749160767,
     0.18381895124912262,
     0.1480993628501892,
     0.0934639647603035,
     -0.17770831286907196,
     0.09219089150428772,
     0.09511983394622803,
     -0.2078547477722168,
     0.11334264278411865,
     0.0864596962928772,
     -0.03732181340456009,
     -0.06170603632926941,
     0.014513085596263409,
     0.19028794765472412,
     0.08083567023277283,
     -0.07796188443899155,
     -0.13692373037338257,
     0.1674240529537201,
     -0.10188353061676025,
     -0.023299138993024826,
     0.13179853558540344,
     -0.06638269871473312,
     -0.2005026787519455,
     -0.1835901290178299,
     0.13835741579532623,
     0.3288293480873108,
     0.16999228298664093,
     -0.1326129138469696,
     -0.012811451219022274,
     -0.12992361187934875,
     -0.027557959780097008,
     -0.027281606569886208,
     0.020332694053649902,
     -0.08542351424694061,
     -0.018185008317232132,
     -0.09051108360290527,
     0.0608053132891655,
     0.20289434492588043,
     -0.032099973410367966,
     -0.04639098420739174,
     0.23962445557117462,
     0.07255730032920837,
     0.004598921164870262,
     -0.010999824851751328,
     0.023231782019138336,
     -0.11400877684354782,
     -0.0024917502887547016,
     -0.12924475967884064,
     -0.034801140427589417,
     0.07540622353553772,
     -0.14184702932834625,
     -0.03451569378376007,
     0.10676993429660797,
     -0.14829140901565552,
     0.14299613237380981,
     -0.016563797369599342,
     -0.059809811413288116,
     -0.09886705130338669,
     -0.021711565554142,
     -0.05889463424682617,
     0.01345861703157425,
     0.08860933780670166,
     -0.19535264372825623,
     0.30562612414360046,
     0.14268618822097778,
     -0.034439895302057266,
     0.09777306765317917,
     0.08058249950408936,
     0.0537114292383194,
     0.04592563211917877,
     -0.03586505353450775,
     -0.11147730052471161,
     -0.12388268858194351,
     0.0529785230755806,
     0.08037040382623672,
     0.08579140156507492,
     0.0404975451529026 ]

There can be a lot of records in the database (at the moment, about 150 thousand, but it is planned to scale up to tens and hundreds of millions).
The essence of the problem is that we are given a new face as an input, its points are calculated (an array of 128 numbers), and we need to find the closest array from the base, in other words, we need to find a point with a minimum Euclidean distance.
The problem is that it takes a very long time to calculate the Euclidean distance for the entire base every time a new face comes in. And another specific feature of this task is that indexes will not help here in any way.
I tried kd-trees, they have O(log n) complexity, but only for dimensions 2.3, and for dimensions higher (128, as in this case), this type of tree works well, very slowly. I tried sorting by coordinate, by distance to the center of coordinates, it takes too long. I also tried the module for python sklearn, namely sklearn.KNeighborsClassifier, also slowly.

To understand what is slow and what is not: I rely on the ability of the program to find the 10 most similar faces from a database of a billion records in 1 second or slower. If you have any ideas on how to implement this task, I would be grateful to hear from you.

PS I think optimizations, such as division into gender and age, are not the best option here, since such algorithms are often mistaken, which in the end can make it impossible to search for a face in the database, given gender and age. In addition, such optimization will not give a strong increase in speed.

Reply

Answer the question

In order to leave comments, you need to log in

4 answer(s)

V

Vindicar, 2021-07-31
@Bleno

How about multi-tier clustering? Those. break the database of faces into clusters of similar ones, then divide these clusters into sub-clusters, and so on. As a result, build a tree, where each tier describes a certain division of persons into subgroups (not necessarily semantic such as male/female). You start from the top tier, having found the most similar node/cluster, perform a second search, but by its immediate descendants. And so, until you narrow the search range to an acceptable one, when it will be possible to run the search over all the faces in this cluster.
Tree parameters (size and number of tiers) will have to be selected independently.
There are possible modifications. First, you can turn a tree into a directed graph by giving each node more than one parent. This can reduce the classification error.
Secondly, you can try to train a classifier (for example, SVM) so that it reports the number of the cluster to which this person most likely belongs, and then search inside this cluster.

U

U235U235, 2021-07-31
@U235U235

Plot a Voronoi diagram for a set of points.
see https://habr.com/en/post/309252/

V

Vladimir Kuts, 2021-07-31
@fox_12

Using Elasticsearch
https://habr.com/ru/company/otus/blog/557210/

T

ThunderCat, 2021-07-31
@ThunderCat

Perhaps it will be easier if you make 2 or 3 basic parameters that will definitely exclude similarity (for example, skin color, eye shape, chin shape, etc.), and / or, for example, add a field with reduced encoding quality to 64-32 in order to reduce sampling, and to do already on the basis of the remainder of the sampling enumeration.