S
S
Stanislav2018-07-28 18:46:24
MongoDB
Stanislav, 2018-07-28 18:46:24

How to overcome, MongoDB 4.0 eats all the memory and crashes, what can you think of?

When mongodb 3.6 and node.js 9.x were installed, everything was great, everything worked for 3-6 months and there were no problems. Recently I decided to update everything and problems started: The site is hosted on VPS debian 9, 4GB of RAM, 2 cores. After the update, a day later, the site goes down, it turned out that mongod had fallen, because it ate all the memory. After I increased the RAM to 8GB, everything worked successfully for 2 days, then again the same thing, I have to restart the server.
Node.js - v10.7.0 MongoDB - 4.0.0
Mongod.conf

storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: true
#  engine:
#  mmapv1:
  wiredTiger:
    engineConfig:
       cacheSizeGB: 6

cacheSizeGB added after and did not help
Syslog
Jul 28 08:56:54 SiteName dhclient[674]: DHCPREQUEST of xx.xx.xxx.186 on eth0 to xxx.xx.1.1 port 67
Jul 28 08:56:54 SiteName dhclient[674]: DHCPACK of xx.xx.xxx.186 from 172.31.1.1
Jul 28 08:56:54 SiteName dhclient[674]: bound to xx.xx.xxx.186 -- renewal in 36837 seconds.
Jul 28 09:00:01 SiteName CRON[16647]: (root) CMD (/usr/bin/node /var/www/SiteName.ru/bin/cron/synonyms.js >/dev/null 2>&1)
Jul 28 09:00:05 SiteName kernel: [331671.923612] node invoked oom-killer: gfp_mask=0x24280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=0, order=0, oom_score_adj=0
Jul 28 09:00:05 SiteName kernel: [331671.926532] node cpuset=/ mems_allowed=0
Jul 28 09:00:05 SiteName kernel: [331671.927797] CPU: 1 PID: 16649 Comm: node Not tainted 4.9.0-7-amd64 #1 Debian 4.9.110-1
Jul 28 09:00:05 SiteName kernel: [331671.930045] Hardware name: Hetzner vServer, BIOS 20171111 11/11/2017
Jul 28 09:00:05 SiteName kernel: [331671.931869]  0000000000000000 ffffffff97530644 ffffa70f811ffcf0 ffff8c66a600a800
Jul 28 09:00:05 SiteName kernel: [331671.933872]  ffffffff97403b30 0000000000000000 0000000000000000 0000000c9754f4c9
Jul 28 09:00:05 SiteName kernel: [331671.936081]  ffff8c66a8f5c0c0 ffffffff973892d7 ffffffffc02d86a5 00000000ffffffff
Jul 28 09:00:05 SiteName kernel: [331671.938199] Call Trace:
Jul 28 09:00:05 SiteName kernel: [331671.939114]  [<ffffffff97530644>] ? dump_stack+0x5c/0x78
Jul 28 09:00:05 SiteName kernel: [331671.964188]  free:25595 free_pcp:30 free_cma:0
Jul 28 09:00:05 SiteName kernel: [331671.972611] Node 0 active_anon:7716024kB inactive_anon:82108kB active_file:92kB inactive_file:160kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:4868kB dirty:92kB writeback:4kB shmem:82720kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
Jul 28 09:00:05 SiteName kernel: [331672.019758] Node 0 DMA32: 216*4kB (UME) 192*8kB (UME) 169*16kB (UME) 187*32kB (UME) 73*64kB (UME) 34*128kB (UME) 17*256kB (UE) 10*512kB (UME) 1*1024kB (E) 3*2048kB (UE) 2*4096kB (M) = 44944kB
Jul 28 09:00:05 SiteName kernel: [331672.027118] Node 0 Normal: 873*4kB (UME) 618*8kB (UE) 367*16kB (UME) 260*32kB (UME) 140*64kB (UME) 46*128kB (UME) 14*256kB (UME) 1*512kB (E) 0*1024kB 0*2048kB 0*4096kB = 41572kB
Jul 28 09:00:05 SiteName kernel: [331672.033850] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Jul 28 09:00:05 SiteName kernel: [331672.037183] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jul 28 09:00:05 SiteName kernel: [331672.040524] 20785 total pagecache pages
Jul 28 09:00:05 SiteName kernel: [331672.042589] 0 pages in swap cache
Jul 28 09:00:05 SiteName kernel: [331672.044689] Swap cache stats: add 0, delete 0, find 0/0
Jul 28 09:00:05 SiteName kernel: [331672.047591] Free swap  = 0kB
Jul 28 09:00:05 SiteName kernel: [331672.049456] Total swap = 0kB
Jul 28 09:00:05 SiteName kernel: [331672.051438] 2047866 pages RAM
Jul 28 09:00:05 SiteName kernel: [331672.053354] 0 pages HighMem/MovableOnly
Jul 28 09:00:05 SiteName kernel: [331672.055177] 51623 pages reserved
Jul 28 09:00:05 SiteName kernel: [331672.056987] 0 pages hwpoisoned
Jul 28 09:00:05 SiteName kernel: [331672.059058] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
v2.10.4: Go
Jul 28 09:00:05 SiteName kernel: [331672.125398] [ 7026]   111  7026  2046636  1763814    3577      10        0             0 mongod
Jul 28 09:00:05 SiteName kernel: [331672.128446] [ 9043]     0  9043   326382    24448     380     166        0             0 node /var/www/f
Jul 28 09:00:05 SiteName kernel: [331672.165206] [16646]     0 16646   243203    21939     141      77        0             0 node
Jul 28 09:00:05 SiteName kernel: [331672.184308] Out of memory: Kill process 7026 (mongod) score 885 or sacrifice child
Jul 28 09:00:05 SiteName kernel: [331672.187084] Killed process 7026 (mongod) total-vm:8186544kB, anon-rss:7055256kB, file-rss:0kB, shmem-rss:0kB
Jul 28 09:00:06 SiteName systemd[1]: mongod.service: Main process exited, code=killed, status=9/KILL
Jul 28 09:00:06 SiteName systemd[1]: mongod.service: Unit entered failed state.
Jul 28 09:00:06 SiteName systemd[1]: mongod.service: Failed with result 'signal'.

Mongod.log
До этого  лог обычных запросов и ничего более

2018-07-28T09:49:08.113+0300 I CONTROL  [main] ***** SERVER RESTARTED *****
2018-07-28T09:49:08.323+0300 I CONTROL  [initandlisten] MongoDB starting : pid=826 port=27017 dbpath=/var/lib/mongodb 64-bit host=SiteName
2018-07-28T09:49:08.323+0300 I CONTROL  [initandlisten] db version v4.0.0
2018-07-28T09:49:08.323+0300 I CONTROL  [initandlisten] git version: 3b07af3d4f471ae89e8186d33bbb1d5259597d51
2018-07-28T09:49:08.323+0300 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.1.0f  25 May 2017
2018-07-28T09:49:08.323+0300 I CONTROL  [initandlisten] allocator: tcmalloc
2018-07-28T09:49:08.323+0300 I CONTROL  [initandlisten] modules: none
2018-07-28T09:49:08.323+0300 I CONTROL  [initandlisten] build environment:
2018-07-28T09:49:08.323+0300 I CONTROL  [initandlisten]     distmod: debian92
2018-07-28T09:49:08.323+0300 I CONTROL  [initandlisten]     distarch: x86_64
2018-07-28T09:49:08.323+0300 I CONTROL  [initandlisten]     target_arch: x86_64
2018-07-28T09:49:08.323+0300 I CONTROL  [initandlisten] options: { config: "/etc/mongod.conf", net: { bindIp: "127.0.0.1", port: 27017 }, processManagement: { timeZoneInfo: "/usr/share/zoneinfo" }, storage: { dbPath: "/var/lib/mongodb", journal: { enabled: true }, wiredTiger: { engineConfig: { cacheSizeGB: 6.0 } } }, systemLog: { destination: "file", logAppend: true, path: "/var/log/mongodb/mongod.log" } }
2018-07-28T09:49:08.323+0300 W STORAGE  [initandlisten] Detected unclean shutdown - /var/lib/mongodb/mongod.lock is not empty.
2018-07-28T09:49:08.324+0300 I STORAGE  [initandlisten] Detected data files in /var/lib/mongodb created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2018-07-28T09:49:08.324+0300 W STORAGE  [initandlisten] Recovering data from the last clean checkpoint.
2018-07-28T09:49:08.324+0300 I STORAGE  [initandlisten] 
2018-07-28T09:49:08.324+0300 I STORAGE  [initandlisten] ** WARNING: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine
2018-07-28T09:49:08.324+0300 I STORAGE  [initandlisten] **          See http://dochub.mongodb.org/core/prodnotes-filesystem
2018-07-28T09:49:08.324+0300 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=6144M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),statistics_log=(wait=0),verbose=(recovery_progress),
2018-07-28T09:49:10.581+0300 I STORAGE  [initandlisten] WiredTiger message [1532760550:581633][826:0x7f99c56a99c0], txn-recover: Main recovery loop: starting at 794/47914112
2018-07-28T09:49:10.581+0300 I STORAGE  [initandlisten] WiredTiger message [1532760550:581954][826:0x7f99c56a99c0], txn-recover: Recovering log 794 through 795
2018-07-28T09:49:10.884+0300 I STORAGE  [initandlisten] WiredTiger message [1532760550:884388][826:0x7f99c56a99c0], file:sizeStorer.wt, txn-recover: Recovering log 795 through 795
2018-07-28T09:49:11.015+0300 I STORAGE  [initandlisten] WiredTiger message [1532760551:15756][826:0x7f99c56a99c0], file:sizeStorer.wt, txn-recover: Set global recovery timestamp: 0
2018-07-28T09:49:11.127+0300 I RECOVERY [initandlisten] WiredTiger recoveryTimestamp. Ts: Timestamp(0, 0)
2018-07-28T09:49:11.295+0300 I CONTROL  [initandlisten] 
2018-07-28T09:49:11.295+0300 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
2018-07-28T09:49:11.295+0300 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
2018-07-28T09:49:11.295+0300 I CONTROL  [initandlisten] 
2018-07-28T09:49:11.508+0300 I FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory '/var/lib/mongodb/diagnostic.data'
2018-07-28T09:49:11.514+0300 I NETWORK  [initandlisten] waiting for connections on port 27017
2018-07-28T09:49:11.905+0300 I CONTROL  [free_mon] Free Monitoring is Enabled. Frequency: 60 seconds
2018-07-28T09:49:12.015+0300 I FTDC     [ftdc] Unclean full-time diagnostic data capture shutdown detected, found interim file, some metrics may have been lost. OK
2018-07-28T09:49:12.573+0300 I NETWORK  [listener] connection accepted from 127.0.0.1:52684 #1 (1 connection now open)
2018-07-28T09:49:12.580+0300 I NETWORK  [conn1] received client metadata from 127.0.0.1:52684 conn1: { driver: { name: "nodejs", version: "3.1.1" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.9.0-7-amd64" }, platform: "Node.js v10.7.0, LE, mongodb-core: 3.1.0" }
2018-07-28T09:49:12.601+0300 I NETWORK  [listener] connection accepted from 127.0.0.1:52688 #2 (2 connections now open)

Mongostat (after restarting the server after half an hour)
1Sr7k.png
Mongostat shows how
Top memory is slowly growing (after restarting the server after an hour)
U9Cwt.png

Answer the question

In order to leave comments, you need to log in

3 answer(s)
S
Stanislav, 2018-08-23
@ms-dred

And so, apparently it was really in mongodb, most likely there was a leak in it.
Today I updated to 4.0.1, turned on top and watched.
I didn’t see the moment of freeing the memory, but mongod used about 45% of the memory, when I looked at the console again an hour later, it was already 33% of the memory, there was a reset. Opened the mongodb log, there were no reboots.
I’ll also observe of course, but it seems that it either got crooked, or the error was originally in mongodb
mongostat also shows that mongo consumes less memory, not by much, but less.

D
Dmitry Makarov, 2018-09-20
@MDiMaI666

I would recommend postgres with sequalizejs, it is better in every way. On my hardware, one database fell once a month, and with postgres, several copies of the application are already calmly holding.

K
KaterynaA, 2019-02-01
@KaterynaA

Hello!
How many collections do you have?

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question