Redis
Al, 2016-07-12 20:04:57

How to properly scale a nodejs application using cluster?

For starters, here's my start script (crooked, because it doesn't work right). I've tried to comment everything in it to make it clearer:

#!/usr/bin/env node
const log = require("lib/log")(module);
require('globals');   // sets up the global variables (global.****)
const DatabaseController = require("lib/mysql");   // mysql query builder
const LocalizationController = require("localizationController"); // loads localization data from mysql into globals
const CronJob = require('cron').CronJob;
const http = require('http');
const cluster = require('cluster');
const numCPUs = require('os').cpus().length;

var app = require('config/application');  // express with all of its setup (middleware, view engine, routes, etc.)
app.set('port', _G.cfg.port);  // _G is a global variable holding the config and other shared data
var server = http.createServer(app);
var io = require('base/socket')(server); // attach socket.io; 'base/socket' contains the core connect/disconnect events, socket authorization, etc.
var redis = require('socket.io-redis');
io.adapter(redis({ host: 'localhost', port: 6379 }));

if (cluster.isMaster) {
  let counter = 0;  // we are in the master, so start a test cron job; the assumption is that cron jobs run only in the master
  let job = new CronJob('* * * * * *', function() {
    log.info(`You will see this message every second: ${counter}`);
    counter++;
  }, null);
  job.start();

  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  cluster.on('exit', (worker, code, signal) => {
    log.info(`worker ${worker.process.pid} died`);
  });
} else {
  // worker code
  server.listen(_G.cfg.port, () => {  // start the server on the port from the config
    LocalizationController.refreshLocales(() => {  // load localizations from the DB into a global variable
      let cc = DatabaseController.cacheController;  // DatabaseController.cacheController is the instance that talks to redis
      cc.getByMask(`${_G.cfg.session.matchingID}:*`, (err, data) => { // load all sessions from redis
        for (let key in data) {
          _G.sess[key.split(':').pop()] = data[key]; // store the userId -> sessionId mapping in a global variable;
                                                     // it is used later to send data to the right socket
        }
        app.set('io', io);
        log.info(`App listen port: "${server.address().port}"`);
      });
    });
  });

  server.on('error', (error) => {
    // error handler
  });
}
}

I have two obvious problems here. The first is the sockets: they misbehave — for example, a socket with the same id can end up authorized several times for a single request. I couldn't figure out how to make them work correctly under cluster, the same way they do without it.
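For context, the usual way to keep socket.io sane under cluster is sticky sessions: every request and every connection from a given client must land on the same worker, while the redis adapter only relays events between workers. Below is a minimal sketch of that pattern; the 'sticky:connection' message name, the ip-hash routing and the PORT constant are made up for illustration and are not part of any library:

// sticky.js — a sketch only
const cluster = require('cluster');
const net = require('net');
const http = require('http');
const numCPUs = require('os').cpus().length;
const PORT = 3000; // stands in for _G.cfg.port

if (cluster.isMaster) {
  const workers = [];
  for (let i = 0; i < numCPUs; i++) workers.push(cluster.fork());

  // same client IP -> same worker, so the socket.io handshake and all
  // subsequent polling/upgrade requests land in one process
  const workerFor = (ip) => {
    let sum = 0;
    for (const ch of ip) sum += ch.charCodeAt(0);
    return workers[sum % workers.length];
  };

  // the master does no HTTP work at all: it accepts raw TCP connections and
  // hands each socket over to a worker (pauseOnConnect keeps the incoming
  // data buffered until the worker resumes the socket)
  net.createServer({ pauseOnConnect: true }, (connection) => {
    workerFor(connection.remoteAddress || '').send('sticky:connection', connection);
  }).listen(PORT);
} else {
  const app = require('config/application');  // your express app
  const server = http.createServer(app);
  const io = require('socket.io')(server);
  io.adapter(require('socket.io-redis')({ host: 'localhost', port: 6379 }));

  // don't listen on the real port in the worker; traffic arrives as handles
  // passed down from the master
  server.listen(0, 'localhost');
  process.on('message', (message, connection) => {
    if (message !== 'sticky:connection') return;
    server.emit('connection', connection);
    connection.resume();
  });
}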
The second, and very important, issue is global variables. I store localization data there, plus the mapping between session ids and the ids of authorized users, and they may end up being used for other things too. The problem is that every process has its own copy of these globals, but they somehow need to be shared: if one process changes something in a global variable, that change should be visible to all the other processes. I don't understand how to do this. The only thing that comes to mind is to keep this data in redis and fetch it when needed, but how much will that hurt performance? Reading a global variable is, after all, cheaper than making a request to redis.
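On the redis idea itself: a lookup against a local redis is typically well under a millisecond, so for a userId -> sessionId map it is rarely the bottleneck compared to the socket emit that follows. A sketch of what such a store could look like, assuming the node_redis client and a sess:<userId> key layout invented for this example:

// sessionStore.js — a sketch of keeping the userId -> sessionId map in redis
// instead of in a per-process global (the key names are illustrative)
const redis = require('redis');
const client = redis.createClient({ host: 'localhost', port: 6379 });

module.exports = {
  // called when a socket authorizes: remember which session belongs to the user
  bind(userId, sessionId, cb) {
    client.set(`sess:${userId}`, sessionId, cb);
  },

  // called when you need to emit to a particular user from any worker
  sessionFor(userId, cb) {
    client.get(`sess:${userId}`, cb);
  },

  // called on disconnect / logout
  unbind(userId, cb) {
    client.del(`sess:${userId}`, cb);
  }
};

With socket.io-redis attached, any worker can then call io.to(sessionId).emit(...) and the adapter delivers the event to whichever process actually owns that socket, since each socket joins a room named after its own id.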
This whole task is very confusing for me (especially the sockets). I don't understand where I'm making mistakes or what I'm doing wrong. I hope you can help...

2 answers

xfg, 2016-07-12
@Sanu0074

The problem is that every process has its own copy of these globals, but they somehow need to be shared: if one process changes something in a global variable, that change should be visible to all the other processes.

Horizontal scaling is, first of all, about being able to deploy the application across N physically separate machines. That means you cannot change something locally and have it automatically duplicated in every process on every remote machine; alas, there is no such magic. To make it possible you need some kind of inter-process communication mechanism, and you have already named one of them: using redis for this purpose.
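One sketch of such a mechanism is redis pub/sub, where a change made in any process (on any machine) is announced to all the others; the 'locales:changed' channel name and the _G.locales structure below are made up for the example:

// a sketch of propagating a change to all processes via redis pub/sub
const redis = require('redis');
const pub = redis.createClient();
const sub = redis.createClient(); // a subscriber needs its own connection

sub.subscribe('locales:changed');
sub.on('message', (channel, payload) => {
  // every worker, on every machine, receives this and refreshes its local copy
  const { key, value } = JSON.parse(payload);
  _G.locales[key] = value;
});

// whichever process changes a locale announces it to everyone else
function updateLocale(key, value) {
  _G.locales[key] = value;
  pub.publish('locales:changed', JSON.stringify({ key, value }));
}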
But your real problem is different: it's the architecture. There isn't any. The code above is spaghetti code, and that's why you think that
if one process changes something in a global variable, then the change should be duplicated everywhere
In fact, once you refactor the code, it may very quickly turn out that you never needed this, and that the global variables weren't really needed either.
In general, the presence of a large number of global variables is one of the indicators of bad code.
You can keep piling up hacks and accumulating technical debt by keeping global state in redis or by passing data from the child processes to the master process and back (node.js does have tools for this), but it would still be better to start studying architectural patterns for the web (and for javascript in particular) and to read the code of other large projects, preferably not only javascript ones.
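The tools referred to are the built-in cluster/child_process message channel: a worker can process.send() to the master and the master can worker.send() back. A minimal sketch, with a made-up 'state:update' message shape:

// a sketch of the built-in master/worker message channel
const cluster = require('cluster');
const numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  for (let i = 0; i < numCPUs; i++) cluster.fork();

  // when one worker reports a change, relay it to all the others
  Object.keys(cluster.workers).forEach((id) => {
    cluster.workers[id].on('message', (msg) => {
      if (!msg || msg.type !== 'state:update') return;
      Object.keys(cluster.workers).forEach((otherId) => {
        if (otherId !== id) cluster.workers[otherId].send(msg);
      });
    });
  });
} else {
  // a worker announces a local change ...
  process.send({ type: 'state:update', key: 'greeting', value: 'hello' });

  // ... and applies changes relayed through the master
  process.on('message', (msg) => {
    if (msg && msg.type === 'state:update') {
      // update the local copy here, e.g. _G[msg.key] = msg.value
    }
  });
}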
Think about the architecture. Start by working out how to avoid this wild juggling of global variables in the first place.

Pavel Shershnev, 2016-07-21
@PvUtrix

All of this can be done much more simply: take a look at PM2's cluster mode.
pm2.keymetrics.io/docs/usage/cluster-mode
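For reference, in PM2's cluster mode you start the plain single-process version of the app and PM2 itself forks and load-balances the workers. A sketch of an ecosystem file (the script path and app name are illustrative):

// ecosystem.config.js — a sketch
module.exports = {
  apps: [{
    name: 'app',
    script: './index.js',
    exec_mode: 'cluster',  // PM2 does the cluster.fork() for you
    instances: 'max'       // one worker per CPU core
  }]
};

It is started with pm2 start ecosystem.config.js. The socket.io caveats stay the same, though: you still need the redis adapter, plus sticky sessions (or a websocket-only transport) for the handshake.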
