How to find / identify similar pictures?
There are more than 150,000 photos on the site, and duplicates appear periodically: an identical photo, a resized photo, a cropped photo, or a photo with altered colors.
I want to filter out such duplicates, and I'm even willing to dedicate separate server capacity to this. The problem is that I don't know where to start or where to dig. Are there any ready-made solutions for Node.js, or even for Node.js + MongoDB, where some data could be stored to make finding duplicates faster?
As already answered above, there are plenty of ready-made solutions for this today.
In the past, it was done, for example, like this:
1) On upload, the image is converted to grayscale (desaturated) and reduced in size (for example, to 10-15 pixels on the longest side), which makes the result largely insensitive to resizing and color changes
2) The resulting small image is converted into a hash
3) The database of previously uploaded images is searched for this hash
4a) If there are no matches, the upload is accepted and a new entry is added to the database
4b) If matches are found, the response says that such an image already exists
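
The steps above can be sketched roughly as follows. This is only an illustration under several assumptions: the answer does not say which hash to use, so the sketch swaps in a simple "average hash" (one bit per thumbnail pixel, set when the pixel is brighter than the mean); the `RgbImage` type, the 8x8 thumbnail size, and the `DuplicateIndex` class are hypothetical names invented for this example; and a plain `Map` stands in for the database (with MongoDB, the hash would presumably be stored in an indexed field). Decoding real image files into pixels is left out.

```typescript
// Hypothetical in-memory image: row-major RGB bytes, 3 bytes per pixel.
interface RgbImage {
  width: number;
  height: number;
  data: Uint8Array; // length = width * height * 3
}

// Assumption: an 8x8 thumbnail, which yields a 64-bit hash (16 hex digits).
const HASH_SIDE = 8;

// Step 1: convert to grayscale and shrink to side x side by block averaging.
function toTinyGray(img: RgbImage, side: number = HASH_SIDE): number[] {
  const out: number[] = [];
  for (let ty = 0; ty < side; ty++) {
    const y0 = Math.floor((ty * img.height) / side);
    const y1 = Math.max(y0 + 1, Math.floor(((ty + 1) * img.height) / side));
    for (let tx = 0; tx < side; tx++) {
      const x0 = Math.floor((tx * img.width) / side);
      const x1 = Math.max(x0 + 1, Math.floor(((tx + 1) * img.width) / side));
      let sum = 0;
      let count = 0;
      for (let y = y0; y < y1; y++) {
        for (let x = x0; x < x1; x++) {
          const i = (y * img.width + x) * 3;
          // Standard luma weights for RGB -> grayscale.
          sum += 0.299 * img.data[i] + 0.587 * img.data[i + 1] + 0.114 * img.data[i + 2];
          count++;
        }
      }
      out.push(Math.round(sum / count));
    }
  }
  return out;
}

// Step 2: turn the thumbnail into a hash (average hash, written as hex).
function averageHash(img: RgbImage): string {
  const gray = toTinyGray(img);
  const mean = gray.reduce((a, b) => a + b, 0) / gray.length;
  let hex = "";
  for (let i = 0; i < gray.length; i += 4) {
    let nibble = 0;
    for (let j = 0; j < 4; j++) {
      nibble = (nibble << 1) | (gray[i + j] > mean ? 1 : 0);
    }
    hex += nibble.toString(16);
  }
  return hex;
}

// Steps 3-4: look the hash up among previously uploaded images.
class DuplicateIndex {
  // Stand-in for the database: hash -> id of the first image with that hash.
  private byHash = new Map<string, string>();

  // Returns the id of an existing image with the same hash (4b),
  // or null after registering the new image (4a).
  add(id: string, img: RgbImage): string | null {
    const hash = averageHash(img);
    const existing = this.byHash.get(hash);
    if (existing !== undefined) {
      return existing;
    }
    this.byHash.set(hash, id);
    return null;
  }
}
```

An exact hash lookup like this tends to catch identical, resized, and recolored copies. A cropped photo usually produces a different thumbnail, so it would generally not match under this scheme.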