How to select unique values in a database by comparing two arrays?

SQL

hbrmdc, 2015-12-10 12:30:51

The database has a "Contacts" table with ~100,000 rows.
Users 1 and 2 each own 1,000 rows (contacts) from this table.
About 30% of these users' contacts are the same (that is, they are the same rows of the "Contacts" table).
User #1 needs to receive a list of those contacts of user #2 that user #1 does not have.
User #1 can quickly add contacts to user #2, after which the list of non-matching contacts needs to be updated.
Which database should I use for this, and how do I implement it correctly?
All I can think of is comparing the lists with simple ordinary functions. The problem is that if these users have 10,000 contacts each and 90% of them match, such a function will already be heavy. And if each contact instead stores a list of the users who have it, there will be problems once 1,000+ users add the same contact: every time, you would have to load the full list of users for that contact, append one more user, and save the whole list again.
I haven't worked with NoSQL, but if that's what I need, I'll look into it.

5 answers

romy4, 2015-12-10
@hbrmdc

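This can be solved, for example, in Postgres with an outer join (x is user #1, y is user #2):
select c2.*
from contacts c2
left outer join contacts c1 on c1.id = c2.id and c1.user_id = x
where c2.user_id = y
  and c1.id is null  -- keep only user y's contacts that have no match for user x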
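
A minimal sketch of the outer-join approach, run here against an in-memory SQLite database from Python instead of Postgres so that it is self-contained. The `user_contacts` link table, its columns, the helper name `missing_contacts` and the sample rows are all assumptions made for illustration, not the asker's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical link table: one row per (user, contact) pair.
    CREATE TABLE user_contacts (
        user_id    INTEGER NOT NULL,
        contact_id INTEGER NOT NULL,
        PRIMARY KEY (user_id, contact_id)
    );
    INSERT INTO user_contacts (user_id, contact_id) VALUES
        (1, 10), (1, 11), (1, 12),
        (2, 11), (2, 12), (2, 13), (2, 14);
""")


def missing_contacts(conn, viewer_id, other_id):
    """Return contacts that other_id has and viewer_id does not (anti-join)."""
    rows = conn.execute(
        """
        SELECT theirs.contact_id
        FROM user_contacts AS theirs
        LEFT JOIN user_contacts AS mine
               ON mine.contact_id = theirs.contact_id
              AND mine.user_id = ?
        WHERE theirs.user_id = ?
          AND mine.contact_id IS NULL   -- no matching row for the viewer
        ORDER BY theirs.contact_id
        """,
        (viewer_id, other_id),
    ).fetchall()
    return [row[0] for row in rows]


print(missing_contacts(conn, 1, 2))  # [13, 14]
```

The filter on the viewer's `user_id` sits in the `ON` clause, not in `WHERE`; moving it to `WHERE` would discard the unmatched rows that the anti-join is supposed to keep.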

lega, 2015-12-10
@lega

User #1 needs to receive a list of those contacts of user #2 that user #1 does not have.

In MongoDB you can use $ne and $nin.
Example:

> db.x.insert({ name: 'linux', user_id: [1] })
> db.x.insert({ name: 'unix', user_id: [2] })
> db.x.insert({ name: 'ubuntu', user_id: [1, 2] })
> db.x.insert({ name: 'windows', user_id: [3] })
> db.x.createIndex({user_id: 1})   //  create an index on user_id

The query itself: all documents that belong to user 2 but not to user 1:

> db.x.find({ user_id: { $eq: 2, $ne: 1 }})
{ "_id" : ObjectId("56695f1e9349d7e6c71d83f1"), "name" : "unix", "user_id" : [ 2 ] }

Dmitry Belyaev, 2015-12-10
@bingo347

The simplest implementation is plain SQL:

-- assumes contact_id identifies the contact shared between users
SELECT id, contact_id FROM contacts
WHERE user_id = 2
AND contact_id NOT IN (SELECT contact_id FROM contacts
WHERE user_id = 1)

nozzy, 2015-12-10
@nozzy

select
  contact_id
from contacts
where user_id = 2            -- contacts of user #2 ...
  and contact_id not in (    -- ... that user #1 does not have
    select contact_id
    from contacts
    where user_id = 1
  )
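
The NOT IN filter can also be tried end to end, including the case from the question where user #1 adds a contact and the list has to be refreshed. This is a rough sketch using Python's `sqlite3`. The `user_contacts` table, its composite primary key and the helper names `not_yet_added` and `add_contact` are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_contacts (
        user_id    INTEGER NOT NULL,
        contact_id INTEGER NOT NULL,
        PRIMARY KEY (user_id, contact_id)
    )
""")
conn.executemany(
    "INSERT INTO user_contacts (user_id, contact_id) VALUES (?, ?)",
    [(1, 10), (1, 11), (2, 11), (2, 12), (2, 13)],
)


def not_yet_added(conn, viewer_id, other_id):
    """Contacts of other_id that viewer_id does not have yet."""
    rows = conn.execute(
        """
        SELECT contact_id FROM user_contacts
        WHERE user_id = ?
          AND contact_id NOT IN (
              SELECT contact_id FROM user_contacts WHERE user_id = ?
          )
        ORDER BY contact_id
        """,
        (other_id, viewer_id),
    ).fetchall()
    return [row[0] for row in rows]


def add_contact(conn, user_id, contact_id):
    # INSERT OR IGNORE turns adding an already-owned contact into a no-op.
    conn.execute(
        "INSERT OR IGNORE INTO user_contacts (user_id, contact_id) VALUES (?, ?)",
        (user_id, contact_id),
    )


before = not_yet_added(conn, 1, 2)
add_contact(conn, 1, 12)
after = not_yet_added(conn, 1, 2)
print(before, after)  # [12, 13] [13]
```

Adding a contact here is a single-row insert, so no per-contact list of users has to be loaded and saved again; the list of missing contacts is simply queried again afterwards.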

Alexander Evseev, 2015-12-17
@alex1t

In T-SQL (MS SQL) this can be done with set operations:

WITH User1Contacts AS (
    SELECT id AS contact_id FROM contacts WHERE user_id = 1
),
User2Contacts AS (
    SELECT id AS contact_id FROM contacts WHERE user_id = 2
)
-- contacts of user #2 that user #1 does not have
SELECT * FROM User2Contacts
EXCEPT
SELECT * FROM User1Contacts

INTERSECT, by contrast, selects only the contacts the two users have in common.
UNION returns the contacts of both users combined (without duplicates).
In set-operation terms:
INTERSECT = A ∩ B
UNION = A ∪ B
EXCEPT = A \ B
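
The same three operations can be illustrated on small in-memory sets. This is a Python sketch only; the contact IDs and variable names are made up for the example.

```python
# Hypothetical contact IDs owned by each user.
user1 = {10, 11, 12}
user2 = {11, 12, 13, 14}

common = user1 & user2       # INTERSECT: contacts both users have
combined = user1 | user2     # UNION: all contacts, without duplicates
only_user2 = user2 - user1   # EXCEPT: user 2's contacts that user 1 lacks

print(sorted(common))      # [11, 12]
print(sorted(combined))    # [10, 11, 12, 13, 14]
print(sorted(only_user2))  # [13, 14]
```

Note that the difference is not symmetric: `user2 - user1` and `user1 - user2` give different results, just as swapping the two sides of EXCEPT does.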