3
3
3ton2020-12-10 14:31:43
ruby
3ton, 2020-12-10 14:31:43

What is the sequence for finding memory leaks in Ruby?

There is an application written not by me and before me.
It is responsible for the business logic of telephony, as well as for providing an API to use this business logic.

The application itself uses the Adhearsion libraries (to communicate with Asterisk) and Sinatra for the API

At some point, the virtual machine hung so much that even via SSH it was not possible to connect to it and I had to do a Hard Reset.

During the proceedings, a problematic place was identified - an application in Ruby.
At the time of launch, it ate 100 meters of RAM, after a couple of weeks it already ate more than a giga of RAM.

I would like to understand how memory leaks are searched, or not closed / deleted objects in memory

UPD: 07-22-2021

today the situation is as follows, I wrote my own logger and try to throw data into the logs at critical moments for me,

in fact, I throw 2 (sometimes 3) parameters
1 - the class and method from which the logger is dumped
2 - a string for the logger
3 - an optional attribute - a separate the log file

discarded 2 loggings that are not particularly important for me in separate files, this is a delayed job library for launching tasks in ruby ​​on a schedule (I use it to start calls in an asterisk at the right time) and separately I throw a log on sinatra (calls to methods and attributes)

now mostly the log collects data on session starts, their processing, as well as billing,

the line in the log takes the following form
Time :: process virtual memory|process RAM :: data from GC(malloc_increase_bytes-malloc_increase_bytes_limit-total_allocated_objects) :: Class->log call method :: message string to log


2021-07-22 09:34:19 +0200 :: 4545776|172832 :: 14869832-16777216-14064439 :: AdviserCall-> :: 17715: Start to call to adviser
2021-07-22 09:34:22 +0200 :: 4545776|172940 :: 16298832-16777216-14074552 :: Direct->run :: 17715: on end call
2021-07-22 09:34:22 +0200 :: 4545776|171628 :: 433496-16777216-14078617: : Direct->run :: 17715: finished call: hungup (16)
2021-07-22 09:34:22 +0200 :: 4545776|171676 :: 1625480-16777216-14081883 :: Direct->run :: 17715: session.update_attributes: stop 2021-07-22 09:34:22 +0200, duration 31.0
2021-07-22 09:34:22 +0200 :: 4545776|171676 :: 2307360-16777216-14090011 :: Direct->run :: 17715: on customer.calculate_duration
2021-07-22 09:34:22 +0200 :: 4545776|171676 :: 2391720-16777216-14092322 :: AdviserCall-> :: : End callcontroller
2021-07-22 09:34:22 +0200 :: 4545776|171676 :: 2568720-16777216-14092986 :: AdviserCall-> :: 17715: after call to adviser
2021-07-22 09:34:22 +0200 : : 4545776|171684 :: 3171688-16777216-14094233 :: Direct->run :: 17715: on adv.calculate_duration
2021-07-22 09:34:22 +0200 :: 4545776|171684 :: 3224400-16777216 : Direct->run :: 17715: unlock adviser
2021-07-22 09:34:22 +0200 :: 4545776|171704 :: 3962392-16777216-14101560 :: Direct->run :: 17715: on session.calculate_billing
2021 -07-22 09:34:22 +0200 :: 4545776|171732 :: 4271248-16777216-14110094 :: Direct->run :: 17715: before EXECUTED status was: EXECUTING
2021-07-22 09:34:22 + 0200::4545776|171756::6101592-16777216-14120962::Direct->::7cf84f4b-7178-4c55-b6bd-3aa42019c6b6: End callcontroller
2021-07-22 09:34:22 +0200 :: 4545776|171812 :: 7845656-16777216-14122863 :: AdviserCall->run :: 17715: call.on_end: hangup-command()
2021-07-22 09: 34:22 +0200 :: 4545776|171836 :: 8181216-16777216-14124630 :: AdviserCall->run :: 17715: before calculate duration
2021-07-22 09:34:22 +0200 :: 4545776|171852 :: 8196080 -16777216-14124849 :: AdviserCall->run :: 17715: before unlock adviser
2021-07-22 09:34:22 +0200 :: 4545776|171852 :: 8224112-16777216-14125307 :: AdviserCall->run :: 17715 : before calculate billing


but in fact, nothing serious has been dug up yet.
I can’t find the moment where there is a serious memory leak, but on measurements I received almost one and a half gigabytes in RAM for the process for 10 days, and once I even found 13 gigabytes in virtual memory

UPD (2021-08-23): I found the direction of the leak search, as well as an interesting bug,
the bug is expressed as follows - when working with a model in mongo, at a certain moment I change the status attribute from EXECUTING to EXECUTED, and in the after_save method of this model I reset the object to the log. and in about 1% of cases, the object had an EXECUTED status in the log, but at the same time in the database I see only EXECUTING

as for leakage - it is formed due to flows. at the time of launching the application, I have 38 threads, after 5 days I already have about 4 hundreds of them, and the memory grows proportionally with them. Found one of the potential problems. In a complex session, when a client creates an incoming call, I create an operator's leg and connect them, after which the operator determines by pin during the conversation which expert will be dialed and the operator's leg will be replaced by the expert's leg. in the process there is a bunch of other actions, including playing greetings / music to each his own. In certain situations, I get an error message in the log like
[2021-08-23 18:27:37.909] ERROR Adhearsion::OutboundCall: [email protected]: <Adhearsion::Call::Hangup> Adhearsion::Call::Hangup
  /usr/local/rvm/gems/ruby-2.3.0/gems/celluloid-0.15.2/lib/celluloid/responses.rb:29:in `value'
  /usr/local/rvm/gems/ruby-2.3.0/gems/celluloid-0.15.2/lib/celluloid/calls.rb:92:in `value'
  /usr/local/rvm/gems/ruby-2.3.0/gems/celluloid-0.15.2/lib/celluloid/proxies/sync_proxy.rb:33:in `method_missing'
  /usr/local/rvm/gems/ruby-2.3.0/gems/adhearsion-2.6.3/lib/adhearsion/call.rb:22:in `method_missing'
  /usr/local/rvm/gems/ruby-2.3.0/gems/matrioska-0.3.0/lib/matrioska/app_runner.rb:27:in `start'
  /usr/local/rvm/gems/ruby-2.3.0/gems/matrioska-0.3.0/lib/matrioska/app_runner.rb:80:in `block in match_and_run'
  /usr/local/rvm/gems/ruby-2.3.0/gems/adhearsion-2.6.3/lib/adhearsion/call_controller.rb:112:in `exec_with_callback'
  /usr/local/rvm/gems/ruby-2.3.0/gems/adhearsion-2.6.3/lib/adhearsion/call_controller.rb:104:in `block (2 levels) in bg_exec'
  /usr/local/rvm/gems/ruby-2.3.0/gems/adhearsion-2.6.3/lib/adhearsion/foundation/exception_handler.rb:5:in `catching_standard_errors'
  /usr/local/rvm/gems/ruby-2.3.0/gems/adhearsion-2.6.3/lib/adhearsion/call_controller.rb:103:in `block in bg_exec'
  /usr/local/rvm/gems/ruby-2.3.0/gems/celluloid-0.15.2/lib/celluloid/thread_handle.rb:13:in `block in initialize'
  /usr/local/rvm/gems/ruby-2.3.0/gems/celluloid-0.15.2/lib/celluloid/internal_pool.rb:100:in `block in create'
  /usr/local/rvm/gems/ruby-2.3.0/gems/logging-1.8.2/lib/logging/diagnostic_context.rb:323:in `block in create_with_logging_context'

at the same time, there is no mention of the application file itself as the source of the error, as a result, only the notion that some thread in the celluloid failed and could not exit

Answer the question

In order to leave comments, you need to log in

3 answer(s)
A
AVKor, 2020-12-10
@AVKor

Here , for example.

R
Roman Mirilaczvili, 2021-01-30
@2ord

Usually these are not memory leaks, but memory bloat due to thoughtless programming: reading a huge JSON entirely into memory, reading all records from the database into memory ( Product.all), etc.
There is a gem that I propose to start the investigation with: derailed_benchmarks . With it, you can determine how much memory the application takes at startup, including the RoR framework. See how to connect to Sinatra.
bundle exec derailed bundle:mem
bundle exec derailed bundle:objects
This is how you can track memory consumption over time:
bundle exec derailed exec perf:mem_over_time
scout_apm - with it you can determine which operations lead to an increase in application memory consumption.
When investigating on a local machine, it is worth considering that if you work with local test data samples, then the memory consumption may be significantly less due to the difference in data volume.
Added:
Found another good comparison of performance monitoring tools, to load:
Rails Performance Monitoring Tools Compared
Added:
Another easy way to reduce memory consumption:
Add
gem 'jemalloc'

3
3ton, 2021-08-31
@3ton

Made code optimization and reduced the number of entries in the same document.
Now the number of Threads continues to grow, but the memory keeps itself normal
True, misunderstandings happen and the code terminates the Thread without allowing all the code in call.on_end to work out
At the same time, errors like

[2021-08-31 18:15:31.879] ERROR Adhearsion::OutboundCall: [email protected]: <Celluloid::Task::TerminatedError> task was terminated
  /usr/local/rvm/gems/ruby-2.3.0/gems/celluloid-0.15.2/lib/celluloid/tasks/task_fiber.rb:32:in `terminate'
  /usr/local/rvm/gems/ruby-2.3.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:404:in `block in cleanup'
--
[2021-08-31 18:16:32.133] ERROR Adhearsion::Call: [email protected]: <Celluloid::Task::TerminatedError> task was terminated
  /usr/local/rvm/gems/ruby-2.3.0/gems/celluloid-0.15.2/lib/celluloid/tasks/task_fiber.rb:32:in `terminate'
  /usr/local/rvm/gems/ruby-2.3.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:404:in `block in cleanup'
--
[2021-08-31 18:17:38.997] ERROR Adhearsion::Call: [email protected]: <Celluloid::Task::TerminatedError> task was terminated
  /usr/local/rvm/gems/ruby-2.3.0/gems/celluloid-0.15.2/lib/celluloid/tasks/task_fiber.rb:32:in `terminate'
  /usr/local/rvm/gems/ruby-2.3.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:404:in `block in cleanup'
--
[2021-08-31 18:18:58.485] ERROR Adhearsion::Call: [email protected]: <Celluloid::Task::TerminatedError> task was terminated
  /usr/local/rvm/gems/ruby-2.3.0/gems/celluloid-0.15.2/lib/celluloid/tasks/task_fiber.rb:32:in `terminate'
  /usr/local/rvm/gems/ruby-2.3.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:404:in `block in cleanup'
--
[2021-08-31 18:20:57.680] ERROR Adhearsion::OutboundCall: [email protected]: <Celluloid::Task::TerminatedError> task was terminated
  /usr/local/rvm/gems/ruby-2.3.0/gems/celluloid-0.15.2/lib/celluloid/tasks/task_fiber.rb:32:in `terminate'
  /usr/local/rvm/gems/ruby-2.3.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:404:in `block in cleanup'
--
[2021-08-31 18:22:23.770] ERROR Adhearsion::OutboundCall: [email protected]: <Celluloid::Task::TerminatedError> task was terminated
  /usr/local/rvm/gems/ruby-2.3.0/gems/celluloid-0.15.2/lib/celluloid/tasks/task_fiber.rb:32:in `terminate'
  /usr/local/rvm/gems/ruby-2.3.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:404:in `block in cleanup'

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question