Node.js
Stanislav, 2018-07-09 21:09:17

How to find / identify similar pictures?

The site has more than 150,000 photos, and duplicates appear periodically: an identical photo, a resized photo, a cropped photo, or a photo with altered colors.
I want to filter out such duplicates, and I am even ready to dedicate separate server capacity to this. The problem is that I don't know where to start or where to dig. Are there ready-made solutions for Node.js, or for Node.js + MongoDB, that store some data to make finding duplicates faster?


3 answers
sim3x, 2018-07-09
@sim3x

https://www.google.com.ua/search?q=npm+image+similarity

Anubis, 2018-07-10
@Anubis

As answered above, there are already plenty of ready-made solutions for this.
Previously, it was done roughly like this:
1) On upload, the image is converted to grayscale and scaled down (for example, to 10-15 pixels on the longest side)
2) The resulting small image is converted into a hash
3) The database of previously uploaded images is searched for this hash
4a) If there are no matches, the upload is accepted and a new entry is added to the database
4b) If matches are found, the response says that such an image already exists
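
The steps above could be sketched roughly as follows. This is only an illustration under several assumptions, not the exact method described in the answer: it uses a simple "average hash" on a fixed 8x8 grid (instead of 10-15 pixels on the longest side), and it assumes the image has already been decoded into a raw RGBA pixel array by some image library. The function names `toGrayscale`, `shrink`, `averageHash`, and `hammingDistance` are hypothetical helpers made up for this example.

```javascript
// Step 1a: convert raw RGBA pixels to a grayscale (luminance) array.
// `rgba` is assumed to hold 4 bytes per pixel, row by row.
function toGrayscale(rgba, width, height) {
  const gray = new Float64Array(width * height);
  for (let i = 0; i < width * height; i++) {
    const r = rgba[i * 4];
    const g = rgba[i * 4 + 1];
    const b = rgba[i * 4 + 2];
    gray[i] = 0.299 * r + 0.587 * g + 0.114 * b;
  }
  return gray;
}

// Step 1b: shrink the grayscale image to size x size cells
// by averaging the source pixels that fall into each cell.
function shrink(gray, width, height, size) {
  const out = new Float64Array(size * size);
  for (let ty = 0; ty < size; ty++) {
    const y0 = Math.floor((ty * height) / size);
    const y1 = Math.max(y0 + 1, Math.floor(((ty + 1) * height) / size));
    for (let tx = 0; tx < size; tx++) {
      const x0 = Math.floor((tx * width) / size);
      const x1 = Math.max(x0 + 1, Math.floor(((tx + 1) * width) / size));
      let sum = 0;
      for (let y = y0; y < y1; y++) {
        for (let x = x0; x < x1; x++) {
          sum += gray[y * width + x];
        }
      }
      out[ty * size + tx] = sum / ((y1 - y0) * (x1 - x0));
    }
  }
  return out;
}

// Step 2: turn the small image into a hash.
// Each cell contributes one bit: 1 if brighter than the mean, else 0.
function averageHash(rgba, width, height, size = 8) {
  const cells = shrink(toGrayscale(rgba, width, height), width, height, size);
  let total = 0;
  for (const v of cells) total += v;
  const mean = total / cells.length;
  let bits = "";
  for (const v of cells) bits += v > mean ? "1" : "0";
  return bits; // string of size*size characters, each "0" or "1"
}

// Step 3 helper: number of differing bits between two hashes.
// A small distance suggests the images are probably similar.
function hammingDistance(a, b) {
  if (a.length !== b.length) throw new Error("hash lengths differ");
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) distance++;
  }
  return distance;
}
```

An exact lookup of the hash string, as in step 3, only finds images whose hashes match bit for bit. Resized or recolored copies may differ in a few bits, so comparing hashes with `hammingDistance` against a small threshold (the threshold value is something to tune on real data) is likely to catch more of them. Heavily cropped copies will probably not be detected by this kind of hash.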

RidgeA, 2018-07-09
@RidgeA

www.phash.org
https://www.npmjs.com/package/phash-image
