I
I
Imp2019-08-18 02:27:15
JavaScript
Imp, 2019-08-18 02:27:15

How to quickly compare two arrays?

There are two arrays.
1. Users 2. Users who have already been processed before. These arrays can have hundreds of thousands of elements. They are loaded from a text file like this:
global.users = [111,333,444,666,777]
global.blacklist = [111,222,333,555,888]

const users = Files.readFileSync('./users.txt', 'utf8').split(/\r?\n/);
const blacklist = Files.readFileSync('./blacklist.txt', 'utf8').split(/\r?\n/);

I need to compare arrays and leave only unique values:
global.users = global.users.filter(user => !global.blacklist.find(b => user == b));

It got to the point that it compares these arrays for 10 minutes.
Can I speed it up somehow? Maybe at the stage of reading these two files. These files are loaded and the arrays are compared when the application starts.

Answer the question

In order to leave comments, you need to log in

4 answer(s)
C
coderisimo, 2019-08-18
@coderisimo

Or maybe compare them BEFORE, even on the server (generate one text file in advance)? Or use the database, immediately getting the desired selection? That is, there is a 'processed' field in the database, which can be 1 or not 1 :)))))))) .
IMHO, "hundreds of thousands of elements" sucks to combine with "loaded from a text file" and "compare at application startup". It's like "you need to exceed the speed of sound" and "using a scooter for a child of 3 years"

I
Ilya Serov, 2019-08-18
@JavaIlya

https://developer.mozilla.org/en/docs/Web/JavaScript...

A
Alexey, 2019-08-18
@Azperin

Are you sure it takes time to compare? I would bet on .split, because it seems to me that 100k+ lines to split is more expensive than just sorting through the array. Well, replace find with a banal indexOf

const blacklist = Files.readFileSync('./blacklist.txt', 'utf8').split(/\r?\n/);
const users = Files.readFileSync('./users.txt', 'utf8').split(/\r?\n/).filter((user) => {
  var idx = blacklist.indexOf(user);
  if (idx === -1) {
    return true;
  } else {
    blacklist.splice(idx, 1); // тут я не уверен в профите уменьшения блеклиста, потому что это надо будет перестраивать индексы, так что можно просто сделать return blacklist.indexOf(user) === -1;
    return false;
  };	
});

L
longclaps, 2019-08-18
@longclaps

Kind of like a merge. Perhaps this method will be faster than with Set() , and less greedy for memory:

let users = [111, 333, 444, 666, 777], // важно, что массивы отсортированы
    blacklist = [111, 222, 333, 555, 888],
    intersection = [];
users.unshift(-1); // -1 заведомо меньше всех
blacklist.unshift(-1);
for (let u = users.pop(), b = blacklist.pop(); u > -1 || b > -1;) {
    if (u > b) u = users.pop();
    else if (u < b) b = blacklist.pop();
    else {
        intersection.push(u);
        u = users.pop();
        b = blacklist.pop();
    }
}
intersection.reverse(); // если это вообще надо
console.log(intersection);

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question